Leveraging Large Language Model for Information Retrieval-based Bug Localization

Moumita Asad, Rafed Muhammad Yasir, Sam Malek

TL;DR

GenLoc addresses two limitations of traditional IRBL, vocabulary mismatch and reliance on project-specific metadata, by combining semantic retrieval between bug reports and code with LLM-based iterative code exploration guided by external functions. It uses embeddings, a vector database, and the ReAct framework so that the model can reason over the code base and selectively examine relevant components. On a large real-world dataset and a recent-bug benchmark, it improves accuracy and ranking quality over both traditional IRBL and current LLM-based methods, while keeping cost and runtime practical and remaining robust on unseen bugs. The paper also includes an ablation study and replication resources.

Abstract

Information Retrieval-based Bug Localization (IRBL) aims to identify buggy source files for a given bug report. Traditional and deep-learning-based IRBL techniques often suffer from vocabulary mismatch and dependence on project-specific metadata, while recent Large Language Model (LLM)-based approaches are limited by insufficient contextual information. To address these issues, we propose GenLoc, an LLM-based technique that combines semantic retrieval with code-exploration functions to iteratively analyze the code base and identify potential buggy files. We evaluate GenLoc on two diverse datasets: a benchmark of 9,097 bugs from six large open-source projects and the GHRB (GitHub Recent Bugs) dataset of 131 recent bugs across 16 projects. Results demonstrate that GenLoc substantially outperforms traditional IRBL, deep learning approaches and recent LLM-based methods, while also localizing bugs that other techniques fail to detect.
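
The semantic retrieval step mentioned in the abstract can be illustrated with a minimal sketch. This is not GenLoc's implementation: the `embed` function below is a toy bag-of-words stand-in for a learned embedding model, the in-memory dictionary stands in for a vector database, and the file names and contents are hypothetical. The LLM-driven exploration that follows retrieval is omitted.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding (assumption): word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(bug_report, files, k=2):
    """Rank files by similarity to the bug report and return the top k paths."""
    query = embed(bug_report)
    scored = [(cosine(query, embed(content)), path)
              for path, content in files.items()]
    # Highest score first; ties broken by path for a deterministic order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:k]]


# Hypothetical code base: path -> text extracted from the file.
files = {
    "ChartRenderer.java": "render chart axis label overlap legend",
    "DataSetParser.java": "parse data set csv column",
    "LoginService.java": "user login password session",
}

candidates = retrieve("chart legend overlaps axis label", files, k=2)
print(candidates)
```

In a pipeline of the kind the abstract describes, a candidate list like this would be a starting point that the LLM then refines by inspecting the code base through exploration functions, rather than a final ranking.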

Paper Structure

This paper contains 22 sections, 4 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: Workflow of GenLoc.
  • Figure 2: LLM Prompt.
  • Figure 3: Bug Report from Birt Project.
  • Figure 4: Overlap Analysis between GenLoc and Non-LLM based IRBL techniques.
  • Figure 5: Overlap Analysis between GenLoc and Recent LLM-based Approaches.
  • ...and 3 more figures