CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers
Ekaterina Trofimova, Emil Sataev, Abhijit Singh Jowhari
TL;DR
CodeRefine targets the problem of converting methodological descriptions in scientific papers into executable code by combining LLM-driven processing with a knowledge-graph ontology and a retrospective retrieval-augmented generation framework. The pipeline segments papers into text chunks, filters for code-relevant content, builds a knowledge graph, generates intermediate code with GPT-4o, and refines it via RRAG using task-aware vector embeddings; final outputs are evaluated against ground-truth code using a Tree-based Structural Edit Distance metric defined as $TSED = \max\{1- \frac{TED}{MaxNodes(T_1, T_2)}, 0\}$. Experiments on five papers show that RRAG, when supplied with the paper and its references in a dynamic database, improves code similarity compared to vanilla prompting, though penalty weights for code edits are paper-dependent and not universal. This approach offers a practical step toward reliable automated code synthesis from scientific text, with potential impact on accelerating the adoption of cutting-edge algorithms and guiding future tool development in research workflows.
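As a minimal sketch of the TSED formula above, the snippet below computes the score from an already known tree edit distance (TED) and the node counts of the two trees. It rests on several assumptions not stated in the text: the TED is supplied by a separate tree edit distance routine (not implemented here), `MaxNodes(T_1, T_2)` is read as the larger of the two node counts, and nodes are counted with Python's built-in `ast` module purely for illustration. The function names `count_nodes` and `tsed` are hypothetical.

```python
import ast


def count_nodes(source: str) -> int:
    """Count the nodes in the Python AST of `source`.

    Assumption: the standard-library `ast` parser stands in for whatever
    parser the evaluation actually uses.
    """
    return sum(1 for _ in ast.walk(ast.parse(source)))


def tsed(ted: float, nodes_1: int, nodes_2: int) -> float:
    """TSED = max(1 - TED / MaxNodes(T1, T2), 0).

    `ted` is the tree edit distance between the two trees, computed
    elsewhere; `nodes_1` and `nodes_2` are their node counts.
    """
    max_nodes = max(nodes_1, nodes_2)
    if max_nodes <= 0:
        raise ValueError("at least one tree must contain a node")
    # Clamp at zero so a large edit distance never yields a negative score.
    return max(1.0 - ted / max_nodes, 0.0)


if __name__ == "__main__":
    generated = "def f(x):\n    return x + 1\n"
    reference = "def f(x):\n    y = x + 1\n    return y\n"
    n1, n2 = count_nodes(generated), count_nodes(reference)
    # The TED value here is a made-up placeholder, not a measured distance.
    print(tsed(ted=4, nodes_1=n1, nodes_2=n2))
```

Identical trees (TED of 0) score 1, and the clamp keeps the score in the range [0, 1] even when the edit distance exceeds the size of the larger tree.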
Abstract
This paper presents CodeRefine, a novel framework for automatically transforming research paper methodologies into functional code using Large Language Models (LLMs). Our multi-step approach first extracts and summarizes key text chunks from papers, analyzes their code relevance, and creates a knowledge graph using a predefined ontology. Code is then generated from this structured representation and enhanced through a proposed retrospective retrieval-augmented generation approach. CodeRefine addresses the challenge of bridging theoretical research and practical implementation, offering a more accurate alternative to LLM zero-shot prompting. Evaluations on diverse scientific papers demonstrate that CodeRefine improves the code implementations generated from those papers, potentially accelerating the adoption of cutting-edge algorithms in real-world applications.
