Improving Spectrum-Based Localization of Multiple Faults by Iterative Test Suite Reduction
Dylan Callaghan, Bernd Fischer
TL;DR
This paper addresses the deterioration of spectrum-based fault localization (SBFL) in multi-fault programs by introducing FLITSR, a purely SBFL-based, iterative test suite reduction technique that constructs a fault-covering basis and re-ranks program elements. FLITSR* extends this approach to yield multiple bases across rounds, mitigating fault masking and dominator effects. Across two large datasets (synthetic multi-fault variants and real Defects4J faults), FLITSR and FLITSR* yield substantial reductions in wasted effort and improved precision and recall, with FLITSR outperforming the state-of-the-art learning-based localizer GRACE on method-level real faults. The results indicate that FLITSR generalizes across SBFL base metrics, offering a practical, scalable way to enhance multi-fault localization without additional modeling or training.
Abstract
Spectrum-based fault localization (SBFL) works well for single-fault programs, but its accuracy decays as the number of faults increases. We present FLITSR (Fault Localization by Iterative Test Suite Reduction), a novel SBFL extension that improves the localization of a given base metric specifically in the presence of multiple faults. FLITSR iteratively selects reduced versions of the test suite that better localize the individual faults in the system. This allows it to identify and re-rank faults that the base metric ranked too low because they were masked by other program elements. We evaluated FLITSR over method-level spectra from an existing large synthetic dataset comprising 75,000 variants of 15 open-source projects with up to 32 injected faults, as well as method-level and statement-level spectra from a new dataset with 326 true multi-fault versions from the Defects4J benchmark set containing up to 14 real faults. For all three spectrum types, we consistently see substantial reductions of the average wasted effort at different fault levels, of 30% to 90% over the best base metric, and generally similarly large increases in precision and recall, albeit with larger variance across the underlying projects. For the method-level real faults, FLITSR also substantially outperforms GRACE, a state-of-the-art learning-based fault localizer.
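To make the idea of iterative test suite reduction concrete, the following is a minimal sketch and not the FLITSR algorithm itself, whose details are not given in this abstract. It assumes Ochiai as the base metric, a spectrum given as a mapping from test name to covered elements and a pass flag, and a simple reduction rule: take the top-ranked element, drop the failing tests that execute it, and re-score on the remaining tests. The function names (`ochiai`, `iterative_reduction`, `rerank`) and the toy data are hypothetical.

```python
import math

def ochiai(spectrum, tests):
    """Ochiai score per element, computed over the given subset of tests.

    spectrum maps test name -> (set of covered elements, passed flag).
    """
    total_failed = sum(1 for t in tests if not spectrum[t][1])
    ef, ep = {}, {}  # per-element counts of failing / passing executions
    for t in tests:
        covered, passed = spectrum[t]
        counts = ep if passed else ef
        for e in covered:
            counts[e] = counts.get(e, 0) + 1
    scores = {}
    for e in set(ef) | set(ep):
        f, p = ef.get(e, 0), ep.get(e, 0)
        denom = math.sqrt(total_failed * (f + p))
        scores[e] = f / denom if denom else 0.0
    return scores

def iterative_reduction(spectrum):
    """Assumed reduction rule: repeatedly pick the top element and remove
    the failing tests that execute it, until no failing test remains."""
    tests = set(spectrum)
    selected = []
    while any(not spectrum[t][1] for t in tests):
        scores = ochiai(spectrum, tests)
        if not scores:
            break
        top = max(sorted(scores), key=lambda e: scores[e])  # name breaks ties
        if scores[top] == 0.0:
            break  # remaining failures are not covered by any element
        selected.append(top)
        tests = {t for t in tests
                 if spectrum[t][1] or top not in spectrum[t][0]}
    return selected

def rerank(spectrum):
    """Place the selected elements first, then the rest in base order."""
    base = ochiai(spectrum, set(spectrum))
    selected = iterative_reduction(spectrum)
    rest = sorted((e for e in base if e not in selected),
                  key=lambda e: (-base[e], e))
    return selected + rest

# Toy spectrum: "a" and "b" stand for faulty elements, "c" and "d" are not.
example = {
    "t1": ({"a", "c"}, False),
    "t2": ({"a", "c"}, False),
    "t3": ({"a", "c"}, False),
    "t4": ({"b", "d"}, False),
    "t5": ({"b", "d"}, True),
    "t6": ({"c"}, True),
    "t7": ({"d"}, True),
}
base_scores = ochiai(example, set(example))
base_ranking = sorted(base_scores, key=lambda e: (-base_scores[e], e))
print(base_ranking)      # ['a', 'c', 'b', 'd']
print(rerank(example))   # ['a', 'b', 'c', 'd']
```

In this toy example the base metric ranks the non-faulty element `c` above the second fault `b`, because `c` is executed by every test that fails due to `a`. Once those failing tests are removed, `b` becomes the top-ranked element on the reduced suite and moves up in the final ranking. The sketch omits any check of whether each selected element is actually needed to explain the failures, so it should not be read as the basis construction mentioned above.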
