When Stability meets Sufficiency: Informative Explanations that do not Overwhelm
Ronny Luss, Amit Dhurandhar
TL;DR
The paper introduces the Path-Sufficient Explanations Method (PSEM), a framework for generating a stable sequence of explanations that progressively removes features from the original input until a minimally sufficient subset remains, while preserving the predicted class. By enforcing monotonicity and bounded perturbations between consecutive explanations, and by combining fidelity with sparsity in an alternating minimization scheme, PSEM yields explanations that are both stable and informative across image, text, and tabular data. Compared to CEM-PP, LIME, and ALIME, PSEM demonstrates superior path stability and fidelity, with qualitative results showing more realistic, interpretable explanations; a user study further confirms the method’s effectiveness in helping users understand local model behavior. The work highlights the importance of stable, actionable explanations for building trust in XAI and outlines future directions, including multiple-path learning and extensions to semi-factual explanations. Overall, PSEM provides a principled, path-based approach to local explanations that improves interpretability without overwhelming users, with demonstrated cross-modality applicability and empirical user validation.
Abstract
Recent studies evaluating various criteria for explainable artificial intelligence (XAI) suggest that fidelity, stability, and comprehensibility are among the most important metrics considered by users of AI across a diverse collection of usage contexts. We consider these criteria as applied to feature-based attribution methods, which are among the most prevalent in the XAI literature. Going beyond standard correlation, methods have been proposed that highlight what should be minimally sufficient to justify the classification of an input (viz. pertinent positives). While minimal sufficiency is an attractive property akin to comprehensibility, the resulting explanations are often too sparse for a human to understand and evaluate the local behavior of the model. To overcome this limitation, we incorporate the criteria of stability and fidelity and propose a novel method called Path-Sufficient Explanations Method (PSEM) that outputs a sequence of stable and sufficient explanations for a given input of strictly decreasing size (or value) -- from original input to a minimally sufficient explanation. This sequence can be thought of as tracing the local boundary of the model in a stable manner, thus providing better intuition about the local model behavior for the specific input. We validate these claims, both qualitatively and quantitatively, with experiments that show the benefit of PSEM across three modalities (image, tabular and text) as well as versus other path explanations. A user study demonstrates the strength of the method in communicating the local behavior, where (many) users are able to correctly determine the prediction made by a model.
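To make the idea of a path of sufficient explanations concrete, the following is a minimal sketch and not the authors' method. It swaps PSEM's optimization for a simple greedy search and assumes a toy linear classifier, a zero baseline for "removed" features, and hypothetical names (`predict`, `greedy_sufficient_path`, `max_drop_per_step`). It only illustrates the three properties described above: each explanation keeps the original prediction (sufficiency), each is a subset of the previous one (monotonicity), and consecutive explanations differ by a bounded number of features (a crude stand-in for stability).

```python
import numpy as np


def predict(weights, bias, x):
    """Toy linear classifier (an assumption): class 1 if the score is positive."""
    return int(weights @ x + bias > 0)


def greedy_sufficient_path(weights, bias, x, max_drop_per_step=1):
    """Greedy illustration only; PSEM itself solves an optimization problem.

    Returns a list of boolean masks. The first keeps every feature; each
    later mask drops at most `max_drop_per_step` features from the previous
    one while the masked input still receives the original prediction.
    """
    target = predict(weights, bias, x)
    mask = np.ones_like(x, dtype=bool)
    path = [mask.copy()]
    # Try features with the smallest absolute contribution first.
    order = np.argsort(np.abs(weights * x))
    while True:
        dropped = 0
        for i in order:
            if not mask[i]:
                continue
            trial = mask.copy()
            trial[i] = False  # "remove" feature i by zeroing it (assumed baseline)
            if predict(weights, bias, x * trial) == target:
                mask = trial
                dropped += 1
                if dropped == max_drop_per_step:
                    break
        if dropped == 0:
            # No single remaining feature can be dropped without changing the class.
            break
        path.append(mask.copy())
    return path


# Hypothetical toy input with four features.
w = np.array([2.0, -1.0, 0.5, 0.1])
b = -0.5
x = np.ones(4)
path = greedy_sufficient_path(w, b, x)
for step, m in enumerate(path):
    print(step, m.astype(int), "kept:", int(m.sum()))
```

The final mask in this sketch is only minimal in the greedy sense (no single further feature can be dropped); it says nothing about the fidelity or sparsity terms that PSEM balances.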
