FLIMs: Fault Localization Interference Mutants, Definition, Recognition and Mitigation

Hengyuan Liu, Zheng Li, Donghua Wang, Yankai Wu, Xiang Chen, Yong Liu

TL;DR

The paper formalizes Fault Localization Interference Mutants (FLIMs) as mutants that are generated from non-faulty code yet are killed by failing tests, and analyzes their root causes via an adapted RIPR model to explain how they disrupt MBFL. It then presents MBFL-FLIM, a framework that uses LLM-based FLIM recognition (with prompting, domain fine-tuning, and confidence estimation) and mitigates FLIM interference by refining MBFL suspiciousness scores. In Defects4J-based experiments with multiple LLMs, MBFL-FLIM substantially improves Top-1/Top-3/Top-5 localization and reduces mean first rank, especially when fine-tuning is combined with PCA-based confidence, and it remains robust in multi-fault scenarios. The work argues that semantic analysis of code behavior, combined with targeted mitigation of interference, yields more accurate and efficient fault localization, with practical implications for large-scale automated debugging.

Abstract

Mutation-based Fault Localization (MBFL) has been widely explored for automated software debugging, leveraging artificial mutants to identify faulty code entities. However, MBFL faces significant challenges from interference mutants, which are generated from non-faulty code entities but can still be killed by failing tests. These mutants mimic the test sensitivity behaviors of real faulty code entities and weaken the effectiveness of fault localization. To address this challenge, we introduce the concept of Fault Localization Interference Mutants (FLIMs) and conduct a theoretical analysis based on the Reachability, Infection, Propagation, and Revealability (RIPR) model, identifying four distinct interference causes. Building on this, we propose a novel approach that recognizes and mitigates FLIMs using LLM-based semantic analysis, enhanced by fine-tuning techniques and confidence estimation strategies to address LLM output instability. The recognized FLIMs are then mitigated by refining the suspiciousness scores calculated by MBFL techniques. We integrate FLIM recognition and mitigation into the MBFL workflow, developing MBFL-FLIM, a fault localization framework that enhances MBFL's effectiveness by reducing misleading interference while preserving real fault-revealing information. Our empirical experiments on the Defects4J benchmark with 395 program versions using eight LLMs demonstrate MBFL-FLIM's superiority over traditional SBFL and MBFL methods, advanced dynamic feature-based approaches, and recent LLM-based fault localization techniques. Specifically, MBFL-FLIM achieves an average improvement of 44 faults in the Top-1 metric over baseline methods. Further evaluation confirms MBFL-FLIM's robust performance in multi-fault scenarios, and ablation experiments validate the contributions of the fine-tuning and confidence estimation components.
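
To make the idea of "refining the suspiciousness scores" concrete, the following is a minimal sketch and not the paper's actual formula, which this summary does not give. It assumes a Metallaxis-style MBFL setup (Ochiai score per mutant, maximum over a statement's mutants) and a hypothetical mitigation rule that scales each mutant's score by one minus the recognizer's FLIM confidence. The names `Mutant`, `flim_confidence`, and `statement_scores` are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Mutant:
    statement: int                # line the mutant was generated from
    killed_failing: int           # number of failing tests that kill it
    killed_passing: int           # number of passing tests that kill it
    flim_confidence: float = 0.0  # hypothetical: recognizer's confidence it is a FLIM


def ochiai(m: Mutant, total_failing: int) -> float:
    """Ochiai suspiciousness computed from mutant kill information."""
    denom = math.sqrt(total_failing * (m.killed_failing + m.killed_passing))
    return m.killed_failing / denom if denom else 0.0


def statement_scores(mutants, total_failing, mitigate=True):
    """Aggregate mutant scores per statement (max), optionally damping FLIMs."""
    scores = {}
    for m in mutants:
        score = ochiai(m, total_failing)
        if mitigate:
            # Assumed mitigation rule: down-weight likely FLIMs.
            score *= 1.0 - m.flim_confidence
        scores[m.statement] = max(scores.get(m.statement, 0.0), score)
    return scores


mutants = [
    Mutant(statement=10, killed_failing=2, killed_passing=0),  # faulty line
    Mutant(statement=20, killed_failing=2, killed_passing=0,
           flim_confidence=0.9),                               # likely FLIM
]
raw = statement_scores(mutants, total_failing=2, mitigate=False)
refined = statement_scores(mutants, total_failing=2, mitigate=True)
```

In this toy input the two statements tie before mitigation, which is exactly the interference the abstract describes; after down-weighting, the faulty statement ranks first.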

Paper Structure

This paper contains 58 sections, 1 theorem, 17 equations, 6 figures, 9 tables, 1 algorithm.

Key Result

Corollary 1

For any FLIM $m_c$ that is killed by a failing test $t_f$, the mutation location of $m_c$ must reside on at least one error propagation path from the original fault to the observable failure in $t_f$.
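
One way to read the corollary operationally is as a path-membership check. The sketch below assumes that error propagation can be approximated by a directed dependence graph (an assumption of this example, not something the summary specifies) and tests whether a mutation location lies on some path from the fault to the point where the failure is observed. A node lies on such a path exactly when it is reachable from the fault and the failure point is reachable from it. The graph and node names are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Return all nodes reachable from start via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def on_propagation_path(graph, fault, failure, location):
    """True if location lies on some fault-to-failure path in graph."""
    # Build the reversed graph to find nodes that can reach the failure.
    reverse = {}
    for src, dsts in graph.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    return (location in reachable(graph, fault)
            and location in reachable(reverse, failure))


# Hypothetical dependence graph: the fault flows through "a" to the assertion,
# while "b" feeds the assertion independently of the fault.
graph = {"fault": ["a"], "a": ["assert"], "b": ["assert"]}
a_on_path = on_propagation_path(graph, "fault", "assert", "a")
b_on_path = on_propagation_path(graph, "fault", "assert", "b")
```

Under this reading, a mutant at `a` could be a FLIM, whereas the corollary rules out a FLIM at `b`, since `b` is not on any propagation path from the fault.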

Figures (6)

  • Figure 1: FLIM Recognition Prompt Template
  • Figure 2: Workflow of MBFL-FLIM
  • Figure 3: The EXAM Distribution Comparison between MBFL-FLIM and Baselines
  • Figure 4: The EXAM Distribution Comparison between MBFL-FLIM and Baselines on Multiple Fault Scenario
  • Figure 5: Component Ablation Comparison of MBFL-FLIM on EXAM Metric
  • ...and 1 more figure

Theorems & Definitions (3)

  • Definition 1: FLIMs
  • Corollary 1: FLIMs Location Property
  • Proof 1