CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou

TL;DR

CodexGraph tackles the challenge of scaling LLM-assisted code reasoning to entire repositories by introducing a task-agnostic code graph database interface. It builds a Python-centric code graph through static analysis, using a two-phase indexing process to capture intra-file and cross-file relationships, and lets LLMs perform structure-aware retrieval by translating their requests into graph queries in an iterative, multi-round pipeline. The approach yields competitive results on repository-level benchmarks (CrossCodeEval, SWE-bench, EvoCodeBench) and demonstrates practical utility through five ModelScope-Agent applications, highlighting improved generalization and task versatility over traditional retrieval-augmented code generation (RACG) methods. The work lays groundwork for flexible, scalable integration of LLMs with large codebases, while acknowledging language scope and indexing efficiency as areas for future enhancement.
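To make the two-phase indexing idea concrete, here is a minimal sketch using only Python's standard `ast` module. It is not the authors' implementation: the node and edge labels (`MODULE`, `CLASS`, `CONTAINS`, `IMPORTS`, and so on), the function names, and the two toy files are all assumptions chosen for illustration. Phase 1 records what each file defines; phase 2 resolves imports across files once every definition is known.

```python
import ast

# Toy repository: file path -> source text (hypothetical example data).
FILES = {
    "shapes.py": "class Shape:\n    def area(self):\n        return 0\n",
    "app.py": "from shapes import Shape\n\ndef main():\n    return Shape().area()\n",
}


def index_file(path, source):
    """Phase 1: collect nodes and intra-file edges for a single module."""
    module = path[:-3].replace("/", ".")  # assumes the path ends in ".py"
    nodes = {module: "MODULE"}
    edges = []    # (source node, relation, target node)
    pending = []  # imports that cannot be resolved yet: (module, target)
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.ClassDef):
            cls = f"{module}.{stmt.name}"
            nodes[cls] = "CLASS"
            edges.append((module, "CONTAINS", cls))
            for item in stmt.body:
                if isinstance(item, ast.FunctionDef):
                    method = f"{cls}.{item.name}"
                    nodes[method] = "METHOD"
                    edges.append((cls, "HAS_METHOD", method))
        elif isinstance(stmt, ast.FunctionDef):
            func = f"{module}.{stmt.name}"
            nodes[func] = "FUNCTION"
            edges.append((module, "CONTAINS", func))
        elif isinstance(stmt, ast.ImportFrom) and stmt.module:
            for alias in stmt.names:
                pending.append((module, f"{stmt.module}.{alias.name}"))
    return nodes, edges, pending


def build_graph(files):
    """Run phase 1 on every file, then phase 2 to add cross-file edges."""
    nodes, edges, pending = {}, [], []
    for path, source in files.items():
        n, e, p = index_file(path, source)
        nodes.update(n)
        edges.extend(e)
        pending.extend(p)
    # Phase 2: an import becomes an edge only if its target was indexed.
    for module, target in pending:
        if target in nodes:
            edges.append((module, "IMPORTS", target))
    return nodes, edges


nodes, edges = build_graph(FILES)
```

Deferring import resolution to a second pass means the order in which files are indexed does not matter, which is one plausible motivation for splitting the work into two phases.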

Abstract

Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale. Current solutions rely on similarity-based retrieval or manual tools and APIs, each with notable drawbacks. Similarity-based retrieval often has low recall in complex tasks, while manual tools and APIs are typically task-specific and require expert knowledge, reducing their generalizability across diverse code tasks and real-world applications. To mitigate these limitations, we introduce CodexGraph, a system that integrates LLM agents with graph database interfaces extracted from code repositories. By leveraging the structural properties of graph databases and the flexibility of the graph query language, CodexGraph enables the LLM agent to construct and execute queries, allowing for precise, code structure-aware context retrieval and code navigation. We assess CodexGraph using three benchmarks: CrossCodeEval, SWE-bench, and EvoCodeBench. Additionally, we develop five real-world coding applications. With a unified graph database schema, CodexGraph demonstrates competitive performance and potential in both academic and real-world environments, showcasing its versatility and efficacy in software engineering. Our application demo: https://github.com/modelscope/modelscope-agent/tree/master/apps/codexgraph_agent.
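As a rough illustration of what "code structure-aware" retrieval can offer over similarity search, the sketch below answers a structural question (which methods can a class call, including inherited ones) by walking typed edges. It uses a plain in-memory edge list as a stand-in for a graph database and ordinary Python as a stand-in for a graph query language; the class names, edge labels, and helper functions are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical code graph: (source node, relation, target node).
EDGES = [
    ("Base", "HAS_METHOD", "Base.save"),
    ("Model", "INHERITS", "Base"),
    ("Model", "HAS_METHOD", "Model.validate"),
    ("User", "INHERITS", "Model"),
    ("User", "HAS_METHOD", "User.login"),
]


def build_index(edges):
    """Group edge targets by (source, relation) for direct lookup."""
    index = defaultdict(list)
    for src, rel, dst in edges:
        index[(src, rel)].append(dst)
    return index


def methods_available(index, cls):
    """Collect methods defined on cls and on every ancestor of cls."""
    seen, stack, methods = set(), [cls], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles or diamond inheritance
        seen.add(current)
        methods.extend(index.get((current, "HAS_METHOD"), []))
        stack.extend(index.get((current, "INHERITS"), []))
    return sorted(methods)


index = build_index(EDGES)
user_methods = methods_available(index, "User")
print(user_methods)
```

A text-similarity search for "User" would have little reason to surface `Base.save`, since the two names share no tokens; following `INHERITS` edges finds it by construction.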

Paper Structure (37 sections, 23 figures, 3 tables)

Figures (23)

  • Figure 1: (a) Using a unified schema, CodexGraph employs code graph databases as interfaces that allow LLM agents to interact seamlessly with code repositories. (b) CodexGraph supports the management of a wide range of tasks, from academic-level code benchmarks to real-world software engineering applications.
  • Figure 2: Illustration of the process for indexing source code to generate a code graph based on the given graph database schema. Subfigure (3) provides a visualization example of the resultant code graph in Neo4j.
  • Figure 3: The primary LLM agent analyzes the given code question and writes natural language queries. These queries are then processed by the translation LLM agent, which translates them into executable graph queries.
  • Figure 4: Performance comparison of different querying strategies on CrossCodeEval Lite (Python) and SWE-bench Lite.
  • Figure 5: WebUI for the Code Chat, used for answering any questions related to code repositories.
  • ...and 18 more figures
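
The division of labor described in the Figure 3 caption, combined with the iterative, multi-round retrieval mentioned in the TL;DR, can be sketched as a simple control loop. Everything below is an assumption made for illustration: the function names, the dictionary protocol between the agents, the round limit, and the stub agents that replace real LLM calls. The Cypher-style string is only an example of what a translated query might look like and is never executed here.

```python
def run_pipeline(question, primary_agent, translator, execute, max_rounds=3):
    """Alternate between planning, translation, and graph execution."""
    context = []  # (natural language query, query result) pairs so far
    for _ in range(max_rounds):
        step = primary_agent(question, context)
        if step["done"]:
            return step["answer"]
        for nl_query in step["queries"]:
            graph_query = translator(nl_query)  # NL -> executable query
            context.append((nl_query, execute(graph_query)))
    return None  # no answer within the round budget


# Stub agents standing in for LLM calls (hypothetical behavior).
def stub_primary(question, context):
    if context:  # enough information gathered: answer from the last result
        return {"done": True, "answer": f"found: {context[-1][1]}"}
    return {"done": False, "queries": ["methods of class User"]}


def stub_translator(nl_query):
    # Illustrative query text only; a real translator would depend on nl_query.
    return "MATCH (c:CLASS {name: 'User'})-[:HAS_METHOD]->(m) RETURN m.name"


def stub_execute(graph_query):
    return ["login"]  # canned result in place of a database call


answer = run_pipeline("What can User do?", stub_primary, stub_translator, stub_execute)
print(answer)
```

Keeping translation in a separate callable mirrors the caption's split between the primary agent, which reasons about the code question, and the translation agent, which handles query syntax.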