DiG-IN: Diffusion Guidance for Investigating Networks -- Uncovering Classifier Differences, Neuron Visualisations, and Visual Counterfactual Explanations
Maximilian Augustin, Yannic Neuhaus, Matthias Hein
TL;DR
DiG-IN addresses core reliability and interpretability challenges of image classifiers with a training-free, diffusion-guided framework that optimizes the inputs of a latent diffusion model to produce realistic images for analysis. It unifies three analysis tasks (classifier disagreement, universal visual counterfactual explanations, and neuron activation visualizations) within a plug-and-play approach that works with any classifier and dataset. Its diffusion-guided VCEs can outperform prior approaches in realism and semantic fidelity; the framework also reveals biases such as shape bias and zero-shot CLIP errors, and provides quantitative tools to distinguish core from spurious neuron features. Collectively, DiG-IN provides a practical, scalable toolkit for debugging and validating vision models in safety-critical or real-world settings.
Abstract
While deep learning has led to huge progress in complex image classification tasks like ImageNet, unexpected failure modes, e.g. via spurious features, call into question how reliably these classifiers work in the wild. Furthermore, for safety-critical tasks the black-box nature of their decisions is problematic, and explanations, or at least methods that make decisions plausible, are urgently needed. In this paper, we address these problems by generating images that optimize a classifier-derived objective using a framework for guided image generation. We analyze the decisions of image classifiers by visual counterfactual explanations (VCEs), detection of systematic mistakes by analyzing images where classifiers maximally disagree, and visualization of neurons and spurious features. In this way, we validate existing observations, e.g. the shape bias of adversarially robust models, and uncover novel failure modes, e.g. systematic errors of zero-shot CLIP classifiers. Moreover, our VCEs outperform previous work while being more versatile.
