DD-CAM: Minimal Sufficient Explanations for Vision Models Using Delta Debugging

Krishna Khadka; Yu Lei; Raghu N. Kacker; D. Richard Kuhn

TL;DR

DD-CAM introduces a gradient-free framework to produce minimal, sufficient explanations for vision models by identifying a 1-minimal subset of internal representations (feature maps or patch tokens) whose joint activation preserves the model's prediction. By adapting delta debugging, the method finds a subset that is locally necessary and sufficient, yielding focused saliency maps with reduced clutter compared to traditional CAM-based approaches. Extensive experiments across CNNs and ViTs on ImageNet and chest radiographs show improved faithfulness and superior localization, validating both the approach and its applicability to safety-critical domains. The work also provides a practical DD-CAM implementation and discusses its limitations, with potential extensions to model debugging and bias analysis.

Abstract

We introduce a gradient-free framework for identifying minimal, sufficient, and decision-preserving explanations in vision models by isolating the smallest subset of representational units whose joint activation preserves predictions. Unlike existing approaches that aggregate all units, often leading to cluttered saliency maps, our approach, DD-CAM, identifies a 1-minimal subset: one whose joint activation suffices to preserve the prediction and from which removing any single unit alters the prediction. To efficiently isolate minimal sufficient subsets, we adapt delta debugging, a systematic reduction strategy from software debugging, and configure its search strategy based on unit interactions in the classifier head: testing individual units for models with non-interacting units and testing unit combinations for models in which unit interactions exist. We then generate minimal, prediction-preserving saliency maps that highlight only the most essential features. Our experimental evaluation demonstrates that our approach can produce more faithful explanations and achieve higher localization accuracy than state-of-the-art CAM-based approaches.
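To make the notion of a 1-minimal subset concrete, the following is a minimal sketch of the generic delta debugging reduction (ddmin) applied to a list of unit indices. It is not the authors' implementation: the names `split`, `ddmin`, and `preserves` are hypothetical, the predicate is a toy stand-in for "mask every unit outside the subset and check whether the model's prediction is unchanged", and the sketch assumes that the full set of units preserves the prediction while the empty set does not. It also does not reproduce the paper's interaction-dependent search configuration.

```python
from typing import Callable, List, Sequence


def split(items: Sequence[int], n: int) -> List[List[int]]:
    """Split items into n contiguous chunks of near-equal size (n <= len(items))."""
    k, m = divmod(len(items), n)
    return [list(items[i * k + min(i, m):(i + 1) * k + min(i + 1, m)])
            for i in range(n)]


def ddmin(units: Sequence[int],
          preserves: Callable[[List[int]], bool]) -> List[int]:
    """Reduce `units` to a 1-minimal subset for which `preserves` still holds.

    Assumes preserves(list(units)) is True and preserves([]) is False.
    """
    units = list(units)
    n = 2  # current granularity (number of chunks)
    while len(units) >= 2:
        chunks = split(units, n)
        reduced = False

        # Try to reduce to a single chunk.
        for chunk in chunks:
            if preserves(chunk):
                units, n, reduced = chunk, 2, True
                break

        # Otherwise, try to drop a single chunk (reduce to its complement).
        if not reduced:
            for i in range(len(chunks)):
                complement = [u for j, c in enumerate(chunks) if j != i for u in c]
                if preserves(complement):
                    units, n, reduced = complement, max(n - 1, 2), True
                    break

        if not reduced:
            if n >= len(units):
                break  # every single-unit removal failed: 1-minimal
            n = min(len(units), 2 * n)  # refine granularity
    return units


# Toy predicate (an assumption for illustration): the "prediction" is
# preserved only when units 2 and 5 are both active, i.e. they interact.
def toy_preserves(subset: List[int]) -> bool:
    return {2, 5} <= set(subset)


minimal = ddmin(range(8), toy_preserves)
```

In this toy setting the reduction ends with units 2 and 5: the pair preserves the prediction, and dropping either one breaks it, which is exactly the 1-minimality property described above. In an actual vision model the predicate would involve a forward pass with the excluded feature maps or patch tokens masked out, so the number of predicate calls determines the cost.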

Paper Structure (24 sections, 3 equations, 2 figures, 2 tables, 1 algorithm)

This paper contains 24 sections, 3 equations, 2 figures, 2 tables, 1 algorithm.

Introduction
Related Work
Model-Agnostic Explanations
Sensitivity-Based Explanations
Class Activation Mapping Explanations
Background
Vision Model Representations
Delta Debugging
Motivation
Approach
Step 1: Activation Extraction
Step 2: Subset Selection via Delta Debugging
Delta Debugging Algorithm
Step 3: Saliency Map Generation
Experimental Design
...and 9 more sections

Figures (2)

Figure 1: Qualitative comparison of saliency maps generated by our approach (DD-CAM) and baseline approaches (Grad-CAM, Grad-CAM++, XGrad-CAM, Layer-CAM, Score-CAM, Ablation-CAM, and Recipro-CAM).
Figure 2: Qualitative localization examples on NIH ChestX-ray14. DD-CAM consistently isolates a single, compact pathological region while baselines vary between diffuse and fragmented responses.
