Improving Code Localization with Repository Memory
Boshi Wang, Weijian Xu, Yunsheng Li, Mei Gao, Yujia Xie, Huan Sun, Dongdong Chen
TL;DR
This work tackles the lack of long-term memory in repository-level code localization by introducing repository memory built from a project's commit history. It defines two memory stores (episodic memory of past commits and semantic memory of active code functionality) and integrates them with LocAgent to form memory-guided localization workflows. Empirical results on the SWE-bench-verified and SWE-bench-live benchmarks show that memory-augmented localization significantly improves accuracy, with combined episodic and semantic memory offering the strongest gains. The findings demonstrate the practical value of long-term, repository-specific memory for expert-like reasoning in software engineering tasks and point to future work on adaptive memory usage and interface design.
Abstract
Code localization is a fundamental challenge in repository-level software engineering tasks such as bug fixing. While existing methods equip language agents with comprehensive tools and interfaces to fetch information from the repository, they overlook the critical aspect of memory: each instance is typically handled from scratch, assuming no prior repository knowledge. In contrast, human developers naturally build long-term repository memory, such as the functionality of key modules and associations between various bug types and their likely fix locations. In this work, we augment language agents with such memory by leveraging a repository's commit history, a rich yet underutilized resource that chronicles the codebase's evolution. We introduce tools that allow the agent to retrieve from a non-parametric memory encompassing recent historical commits and linked issues, as well as functionality summaries of actively evolving parts of the codebase identified via commit patterns. We demonstrate that augmenting LocAgent, a state-of-the-art localization framework, with such a memory can significantly improve its performance on both SWE-bench-verified and the more recent SWE-bench-live benchmarks. Our research contributes towards developing agents that can accumulate and leverage past experience for long-horizon tasks, more closely emulating the expertise of human developers.
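
To make the idea of retrieving from a memory of past commits and linked issues concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical `CommitRecord` type, a tiny hand-written memory, and a simple IDF-weighted token-overlap score standing in for whatever retriever the actual tools use. Given a new issue description, it returns the most similar past commits, whose touched files can serve as candidate fix locations.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class CommitRecord:
    """Hypothetical episodic-memory entry: one past commit and its linked issue."""
    sha: str
    message: str
    issue_text: str
    files: list[str]


def tokenize(text: str) -> list[str]:
    # Lowercase alphanumeric/underscore tokens; deliberately simplistic.
    return re.findall(r"[a-z0-9_]+", text.lower())


def retrieve(query: str, memory: list[CommitRecord], k: int = 2) -> list[CommitRecord]:
    """Return up to k commits ranked by IDF-weighted token overlap with the query.

    This scoring rule is an illustrative assumption, not the paper's retriever.
    """
    docs = [set(tokenize(c.message + " " + c.issue_text)) for c in memory]
    doc_freq = Counter(tok for doc in docs for tok in doc)
    n = len(memory)
    query_tokens = set(tokenize(query))

    def score(doc: set[str]) -> float:
        # Rare shared tokens contribute more than common ones.
        return sum(math.log(1 + n / doc_freq[t]) for t in query_tokens & doc)

    scored = [(score(doc), commit) for commit, doc in zip(memory, docs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop commits that share no tokens with the query.
    return [commit for s, commit in scored[:k] if s > 0]


# Toy memory; all commits, issues, and paths are made up for illustration.
memory = [
    CommitRecord("a1", "Fix crash in config parser on empty file",
                 "Loading an empty config raises IndexError",
                 ["pkg/config/parser.py"]),
    CommitRecord("b2", "Add retry to HTTP client",
                 "Requests fail on timeout",
                 ["pkg/net/client.py"]),
    CommitRecord("c3", "Update docs for CLI flags",
                 "Docs are outdated",
                 ["docs/cli.md"]),
]

hits = retrieve("IndexError when parsing an empty config file", memory)
for hit in hits:
    print(hit.sha, hit.files)
```

In this toy setup only the config-parser commit shares tokens with the query, so its files are the ones surfaced to the agent. A realistic memory would also need the semantic side (functionality summaries of actively evolving code), which this sketch does not cover.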
