RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph

Siru Ouyang, Wenhao Yu, Kaixin Ma, Zilin Xiao, Zhihan Zhang, Mengzhao Jia, Jiawei Han, Hongming Zhang, Dong Yu

TL;DR

RepoGraph tackles the gap in repository-level code understanding for AI software engineering by introducing a line-level code graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$, where $\mathcal{V}$ are code lines and edges include $\mathcal{E}_{invoke}$ and $\mathcal{E}_{contain}$. Constructed via AST parsing with tree-sitter, RepoGraph filters project-dependent relations to emphasize meaningful dependencies and enables $k$-hop ego-graph retrieval around search terms for integration with both procedural and agent-based frameworks. Empirical results on SWE-bench-Lite show consistent performance gains across RAG, Agentless, SWE-agent, and AutoCodeRover, often with favorable cost characteristics, and CrossCodeEval transferability confirms generalization to repository-level tasks. Overall, RepoGraph advances AI software engineering by providing a scalable, repository-wide context mechanism, with open-source release and potential extensions to multiple languages and real-time feedback systems for end-to-end automated maintenance and debugging.
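The $k$-hop ego-graph retrieval mentioned above can be illustrated with a small sketch. This is not the authors' implementation (which builds the graph with tree-sitter); the toy edge list, the node labels such as `def:parse_config`, and the helper names `build_adjacency` and `k_hop_ego_graph` are all hypothetical, chosen only to show how a search term's neighborhood could be collected over "contain" and "invoke" edges with a breadth-first search.

```python
from collections import deque

# Hypothetical toy graph, written by hand (not real RepoGraph output).
# Each edge is (source, relation, target), relation in {"contain", "invoke"}.
EDGES = [
    ("utils.py", "contain", "def:parse_config"),
    ("utils.py", "contain", "def:load_file"),
    ("def:parse_config", "invoke", "def:load_file"),
    ("main.py", "contain", "def:run"),
    ("def:run", "invoke", "def:parse_config"),
    ("main.py", "contain", "def:cli"),
    ("def:cli", "invoke", "def:run"),
]


def build_adjacency(edges):
    """Index edges in both directions so an ego-graph reaches callers and callees."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, set()).add((relation, target))
        adjacency.setdefault(target, set()).add((relation + "_by", source))
    return adjacency


def k_hop_ego_graph(adjacency, center, k):
    """Return {node: hop distance} for every node within k hops of center."""
    if center not in adjacency:
        return {}
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond the k-hop frontier
        for _relation, neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


adjacency = build_adjacency(EDGES)
ego = k_hop_ego_graph(adjacency, "def:parse_config", 1)
for node, hops in sorted(ego.items(), key=lambda item: (item[1], item[0])):
    print(hops, node)
```

In this sketch edges are traversed in both directions, which is an assumption made so that a query on a function surfaces both what it calls and what calls it; a larger `k` widens the retrieved context at the cost of a longer prompt.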

Abstract

Large Language Models (LLMs) excel in code generation yet struggle with modern AI software engineering tasks. Unlike traditional function-level or file-level coding tasks, AI software engineering requires not only basic coding proficiency but also advanced skills in managing and interacting with code repositories. However, existing methods often overlook the need for repository-level code understanding, which is crucial for accurately grasping the broader context and developing effective solutions. On this basis, we present RepoGraph, a plug-in module that manages a repository-level structure for modern AI software engineering solutions. RepoGraph offers the desired guidance and serves as a repository-wide navigation aid for AI software engineers. We evaluate RepoGraph on SWE-bench by plugging it into four methods drawn from two lines of approach, where RepoGraph substantially boosts the performance of all systems, leading to a new state of the art among open-source frameworks. Our analyses also demonstrate the extensibility and flexibility of RepoGraph by testing on another repo-level coding benchmark, CrossCodeEval. Our code is available at https://github.com/ozyyshr/RepoGraph.

Paper Structure

This paper contains 30 sections, 16 figures, 7 tables.

Figures (16)

  • Figure 1: Illustration of (a) a function-level coding problem from HumanEval (Chen et al., 2021) and (b) a repository-level coding problem from SWE-bench (Jimenez et al., 2024).
  • Figure 2: An in-depth illustration of (a) the construction of RepoGraph, (b) its integration with procedural frameworks, and (c) its integration with agent frameworks. Given a code repository, we first use AST parsing to construct $\mathcal{G}=\{\mathcal{V}, \mathcal{E}\}$, where $\mathcal{V}$ consists of "reference" and "definition" nodes and $\mathcal{E}$ includes "invoke" and "contain" relations (files and code lines are shown in corresponding colors). The constructed RepoGraph is then used in procedural frameworks by adding sub-retrieval results to each step, and in agent frameworks by adding graph retrieval as an additional action, "search_repograph". A simplified version can be found in Figure \ref{fig: pipeline}.
  • Figure 3: Venn diagrams of issues resolved by RepoGraph and the baselines on (a) the procedural framework and (b) the agent framework on SWE-bench-Lite. We also plot the error distribution of the cases where one system fails and its counterpart succeeds, e.g., the detailed error distribution of the $12$ cases where RepoGraph succeeds while Agentless fails.
  • Figure 4: Instructions used in the procedural framework to localize the specific files and code lines to edit.
  • Figure 5: Instruction template used for fixing an issue based on the identified locations.
  • ...and 11 more figures