Hierarchical Knowledge Injection for Improving LLM-based Program Repair

Ramtin Ehsani; Esteban Parra; Sonia Haiduc; Preetha Chatterjee

TL;DR

This paper addresses a limitation of LLM-based automated program repair (APR): many bug fixes require repository-level and project-level context beyond the local code. It proposes a layered knowledge-injection framework that progressively augments prompts with a Bug Knowledge Layer, a Repository Knowledge Layer, and a Project Knowledge Layer, and evaluates it on 314 BugsInPy bugs using Llama 3.3 and GPT-4o-mini. Each added layer yields incremental gains, and the final fix rate with Llama 3.3 reaches 79% (250/314), a substantial improvement over prior work. Certain bug types (Program Anomaly, GUI, Network) remain challenging even after all layers are injected. The study suggests that interactive and adaptive APR systems, potentially agent-driven, are needed to handle complex and structurally isolated bugs. It also provides replication materials and a detailed analysis of how different context levels affect repair across bug types.

Abstract

Prompting LLMs with bug-related context (e.g., error messages, stack traces) improves automated program repair, but many bugs remain unresolved. In real-world projects, developers often rely on broader repository and project-level context beyond the local code to resolve such bugs. In this paper, we investigate how automatically extracting and providing such knowledge can improve LLM-based program repair. We propose a layered knowledge injection framework that incrementally augments LLMs with structured context. It starts with the Bug Knowledge Layer, which includes information such as the buggy function and failing tests; expands to the Repository Knowledge Layer, which adds structural dependencies, related files, and commit history; and finally injects the Project Knowledge Layer, which incorporates relevant details from documentation and previously fixed bugs. We evaluate this framework on a dataset of 314 bugs from BugsInPy using two LLMs (Llama 3.3 and GPT-4o-mini), and analyze fix rates across six bug types. By progressively injecting knowledge across layers, our approach achieves a fix rate of 79% (250/314) using Llama 3.3, a significant improvement of 23% over previous work. All bug types show improvement with the addition of repository-level context, while only a subset benefits further from project-level knowledge, highlighting that different bug types require different levels of contextual information for effective repair. We also analyze the remaining unresolved bugs and find that more complex and structurally isolated bugs, such as Program Anomaly and GUI bugs, remain difficult even after injecting all available information. Our results show that layered context injection improves program repair and suggest the need for interactive and adaptive APR systems.
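
The layered injection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one plausible reading of "incrementally augments": a bug is first attempted with the Bug Knowledge Layer only, and the next layer is added only if the candidate patch still fails. The names `LAYER_ORDER`, `build_prompt`, `repair_with_layers`, `generate`, and `passes_tests`, along with the prompt layout, are hypothetical; `generate` stands in for a call to an LLM and `passes_tests` for running the failing tests.

```python
from typing import Callable, Dict, Optional, Tuple

# Layer names follow the abstract; everything else here is an assumption.
LAYER_ORDER = ["bug", "repository", "project"]


def build_prompt(layers: Dict[str, Dict[str, str]], upto: str) -> str:
    """Concatenate the context of every layer up to and including `upto`."""
    last = LAYER_ORDER.index(upto)
    parts = ["Fix the bug described below."]
    for name in LAYER_ORDER[: last + 1]:
        # Each layer maps a context kind (e.g. "failing_tests") to its text.
        for kind, text in layers.get(name, {}).items():
            parts.append(f"## {name}: {kind}")
            parts.append(text)
    return "\n".join(parts)


def repair_with_layers(
    layers: Dict[str, Dict[str, str]],
    generate: Callable[[str], str],
    passes_tests: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Escalate through the layers until a candidate patch passes the tests.

    Returns (layer_name, patch) for the first success, or (None, None)
    if the bug is still unresolved after all layers.
    """
    for name in LAYER_ORDER:
        prompt = build_prompt(layers, upto=name)
        patch = generate(prompt)  # hypothetical LLM call
        if passes_tests(patch):   # hypothetical test harness
            return name, patch
    return None, None
```

Under this reading, the layer returned for each bug records how much context that bug needed, which is the kind of per-bug-type breakdown the abstract reports.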
