Enhancing repository-level software repair via repository-aware knowledge graphs

Boyang Yang, Jiadong Ren, Shunfu Jin, Yang Liu, Feng Liu, Bach Le, Haoye Tian

TL;DR

KGCompass couples a repository-aware knowledge graph (KG) with path-guided prompting to bridge the semantic gap between issue reports and code patches in repository-level software repair. By mining a KG that links issues, pull requests, files, classes, and functions, it narrows the search space to the 20 most relevant candidate functions, each with an interpretable entity path, enabling a single LLM to generate accurate patches at about $0.20 per repair. On SWE-bench Lite, it achieves state-of-the-art single-LLM repair performance among open-source approaches (58.3% resolved) and 56.0% function-level fault-location accuracy, with relative gains of 30.2% to 156.4% over pure-LLM baselines across four backbones. The results indicate that graph-guided context and multi-hop reasoning substantially improve repair quality while keeping the reasoning transparent and the cost low, establishing a new baseline for repository-level repair. Future work could extend the KG to richer domain knowledge and apply path-guided reasoning to other NL-to-code tasks.

Abstract

Repository-level software repair faces challenges in bridging semantic gaps between issue descriptions and code patches. Existing approaches, which primarily rely on large language models (LLMs), are hindered by semantic ambiguities, limited understanding of structural context, and insufficient reasoning capabilities. To address these limitations, we propose KGCompass with two innovations: (1) a novel repository-aware knowledge graph (KG) that accurately links repository artifacts (issues and pull requests) and codebase entities (files, classes, and functions), allowing us to narrow the vast search space to the 20 most relevant functions, with accurate candidate fault locations and contextual information, and (2) a path-guided repair mechanism that leverages KG-mined entity paths; tracing these paths lets us augment LLMs with relevant contextual information to generate precise patches along with their explanations. Experimental results on SWE-bench Lite demonstrate that KGCompass achieves state-of-the-art single-LLM repair performance (58.3%) and function-level fault location accuracy (56.0%) among open-source approaches with a single repair model, costing only $0.20 per repair. Among the bugs that KGCompass successfully localizes, 89.7% lack explicit location hints in the issue and are found only through multi-hop graph traversal, where pure LLMs struggle to locate bugs accurately. Relative to pure-LLM baselines, KGCompass lifts the resolved rate by 50.8% on Claude-4 Sonnet, 30.2% on Claude-3.5 Sonnet, 115.7% on DeepSeek-V3, and 156.4% on Qwen2.5 Max. These consistent improvements demonstrate that this graph-guided repair framework delivers model-agnostic, cost-efficient repair and sets a strong new baseline for repository-level repair.
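
The multi-hop traversal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the node names, the `type:name` labeling scheme, the uniform edge weights, and the use of shortest-path cost as the relevance score are all assumptions made for illustration. It shows how an issue with no explicit location hint can still reach candidate functions through linked pull requests and files, and how the path itself serves as an explanation.

```python
import heapq

def build_graph(edges):
    """Build an undirected weighted adjacency map from (a, b, weight) triples."""
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, []).append((b, weight))
        graph.setdefault(b, []).append((a, weight))
    return graph

def rank_functions(graph, issue, top_k=20):
    """Return up to top_k (function, cost, path) tuples closest to the issue node.

    Uses Dijkstra's algorithm; a lower path cost is treated as higher relevance
    (an assumption of this sketch, not a scoring rule taken from the paper).
    """
    best = {issue: 0.0}
    prev = {}
    heap = [(0.0, issue)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))

    def path_to(node):
        # Walk predecessor links back to the issue to recover the entity path.
        path = [node]
        while path[-1] != issue:
            path.append(prev[path[-1]])
        return path[::-1]

    functions = sorted(
        (n for n in best if n.startswith("func:")),
        key=lambda n: (best[n], n),
    )
    return [(n, best[n], path_to(n)) for n in functions[:top_k]]

# Hypothetical repository: the issue never names a function, but it is linked
# to a pull request that touched a file containing the candidate functions.
EDGES = [
    ("issue:101", "pr:95", 1.0),
    ("pr:95", "file:dates.py", 1.0),
    ("file:dates.py", "func:parse_date", 1.0),
    ("file:dates.py", "class:DateParser", 1.0),
    ("class:DateParser", "func:DateParser.normalize", 1.0),
]
GRAPH = build_graph(EDGES)
RANKED = rank_functions(GRAPH, "issue:101")

for name, cost, path in RANKED:
    print(f"{name} (cost {cost}): {' -> '.join(path)}")
```

Each returned path (for example issue, then pull request, then file, then function) is the kind of entity chain that could be placed in a repair prompt alongside the function body, so the model sees why a candidate was selected as well as what it contains.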

Paper Structure

This paper contains 25 sections, 4 equations, 10 figures, 7 tables, 1 algorithm.

Figures (10)

  • Figure 1: Motivating Example of KG-based Fault Location
  • Figure 2: Motivating Example of KG-guided Repair
  • Figure 3: Overview of KGCompass
  • Figure 4: LLM Prompt Template for Fault Location
  • Figure 5: KG-mined Relevant Function Format with Entity Path for Bug Repair
  • ...and 5 more figures