Improved IR-based Bug Localization with Intelligent Relevance Feedback

Asif Mohammed Samir, Mohammad Masudur Rahman

TL;DR

BRaIn addresses the contextual gap between bug reports and code in IR-based bug localization by using Large Language Models (LLMs) to provide Intelligent Relevance Feedback. It combines BM25-based retrieval with LLM-driven relevance assessment, code-segment analysis via JavaParser, PageRank-based term selection, and query expansion to re-score and re-rank candidate documents. Evaluated on the Bench4BL dataset with 4,683 bug reports, BRaIn significantly outperforms baselines across MAP, MRR, and HIT@K, and localizes a sizable fraction of the bugs with low-quality reports that baseline methods miss. The approach demonstrates the practical value of coupling IR with contextual, model-driven reasoning for scalable and more accurate bug localization.
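The retrieve, assess, expand, and re-rank loop summarized above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `is_relevant` is a hypothetical callback standing in for the LLM relevance judgment, documents are treated as flat text rather than JavaParser code segments, the co-occurrence graph used for PageRank is one plausible construction, and the BM25 parameters (`k1=1.2`, `b=0.75`) are common defaults rather than values taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` (name -> text) against `query` with Okapi BM25."""
    tokenized = {name: tokenize(body) for name, body in docs.items()}
    n = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return scores


def pagerank_terms(token_lists, top_k=3, damping=0.85, iters=30, window=2):
    """Pick the top_k terms by PageRank over an undirected co-occurrence graph.

    Assumption: two terms are linked when they occur within `window` tokens.
    """
    neighbors = defaultdict(set)
    for toks in token_lists:
        for i, term in enumerate(toks):
            for other in toks[i + 1:i + 1 + window]:
                if other != term:
                    neighbors[term].add(other)
                    neighbors[other].add(term)
    if not neighbors:
        return []
    size = len(neighbors)
    rank = {term: 1.0 / size for term in neighbors}
    for _ in range(iters):
        rank = {
            term: (1 - damping) / size
            + damping * sum(rank[u] / len(neighbors[u]) for u in neighbors[term])
            for term in neighbors
        }
    return sorted(rank, key=lambda t: (-rank[t], t))[:top_k]


def localize(bug_report, docs, is_relevant, top_n=2):
    """Retrieve with BM25, keep judged-relevant candidates, expand the query, re-rank."""
    first_pass = bm25_scores(bug_report, docs)
    candidates = sorted(first_pass, key=lambda d: (-first_pass[d], d))[:top_n]
    # `is_relevant` is a hypothetical stand-in for the LLM relevance feedback.
    relevant = [d for d in candidates if is_relevant(bug_report, docs[d])]
    expansion = pagerank_terms([tokenize(docs[d]) for d in relevant])
    expanded_query = bug_report + " " + " ".join(expansion)
    second_pass = bm25_scores(expanded_query, docs)
    ranking = sorted(second_pass, key=lambda d: (-second_pass[d], d))
    return ranking, expansion


if __name__ == "__main__":
    corpus = {
        "DateParser.java": "parse date string timezone offset",
        "Button.java": "render button color theme",
        "CalendarUtil.java": "timezone offset calendar parse",
    }
    # Keyword check used only to make the demo runnable without an LLM.
    judge = lambda report, code: "calendar" in code
    print(localize("timezone parse error", corpus, judge))
```

Passing the relevance judge in as a callback keeps the retrieval and expansion logic testable without a model; in a real setting the callback would wrap a prompt that shows the bug report and a code segment to the LLM.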

Abstract

Software bugs pose a significant challenge during development and maintenance, and practitioners spend nearly 50% of their time dealing with bugs. Many existing techniques adopt Information Retrieval (IR) to localize a reported bug using textual and semantic relevance between bug reports and source code. However, they often struggle to bridge a critical gap between bug reports and code that requires in-depth contextual understanding, which goes beyond textual or semantic relevance. In this paper, we present a novel technique for bug localization, BRaIn, that addresses the contextual gaps by assessing the relevance between bug reports and code with Large Language Models (LLMs). It then leverages the LLM's feedback (a.k.a., Intelligent Relevance Feedback) to reformulate queries and re-rank source documents, improving bug localization. We evaluate BRaIn using a benchmark dataset, Bench4BL, and three performance metrics and compare it against six baseline techniques from the literature. Our experimental results show that BRaIn outperforms baselines by 87.6%, 89.5%, and 48.8% margins in MAP, MRR, and HIT@K, respectively. Additionally, it can localize approximately 52% of bugs that cannot be localized by the baseline techniques due to the poor quality of corresponding bug reports. By addressing the contextual gaps and introducing Intelligent Relevance Feedback, BRaIn not only advances theory but also improves IR-based bug localization.
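For readers unfamiliar with the three metrics named in the abstract, the following sketch shows how MAP, MRR, and HIT@K are commonly computed from ranked result lists. It is an illustration under assumptions: the function names are hypothetical, average precision is normalized by the number of known buggy files (some studies normalize differently), and the paper's exact evaluation script may differ.

```python
def reciprocal_rank(ranked, buggy):
    """1 / position of the first buggy file in the ranking, or 0 if none appears."""
    for pos, doc in enumerate(ranked, start=1):
        if doc in buggy:
            return 1.0 / pos
    return 0.0


def average_precision(ranked, buggy):
    """Mean of precision values taken at each position where a buggy file appears.

    Assumption: normalized by the total number of known buggy files.
    """
    if not buggy:
        return 0.0
    hits, total = 0, 0.0
    for pos, doc in enumerate(ranked, start=1):
        if doc in buggy:
            hits += 1
            total += hits / pos
    return total / len(buggy)


def hit_at_k(ranked, buggy, k):
    """1 if at least one buggy file is in the top k results, else 0."""
    return 1.0 if any(doc in buggy for doc in ranked[:k]) else 0.0


def evaluate(results, k=10):
    """Average the per-bug metrics over (ranked_list, buggy_set) pairs."""
    n = len(results)
    return {
        "MAP": sum(average_precision(r, b) for r, b in results) / n,
        "MRR": sum(reciprocal_rank(r, b) for r, b in results) / n,
        f"HIT@{k}": sum(hit_at_k(r, b, k) for r, b in results) / n,
    }
```

As a worked case, a ranking `["a", "b", "c", "d"]` with buggy files `{"b", "d"}` has a reciprocal rank of 1/2 (first hit at position 2) and an average precision of (1/2 + 2/4) / 2 = 0.5.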

Paper Structure

This paper contains 22 sections, 5 equations, 4 figures, and 12 tables.

Figures (4)

  • Figure 1: Buggy Code with Diff
  • Figure 2: Schematic Diagram of BRaIn: (A) Document Indexing & Retrieval, (B) Intelligent Relevance Feedback, (C) Query Expansion, and (D) Bug Localization
  • Figure 3: Performance of BRaIn with Low Quality Bug Reports
  • Figure 4: Rank Improvement: BRaIn vs Blizzard