GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers
Éloi Zablocki, Valentin Gerard, Amaia Cardiel, Eric Gaussier, Matthieu Cord, Eduardo Valle
TL;DR
GIFT produces global, faithful textual explanations for vision classifiers in four stages: it generates local, faithful visual counterfactuals, describes each counterfactual with a change caption, aggregates those captions with an LLM into global hypotheses, and verifies each candidate hypothesis with causal metrics computed from image interventions. The resulting explanations are interpretable, apply across diverse domains (CLEVR, CelebA, BDD-OIA), and reveal both the rules and the biases that vision models rely on. The method uses CaCE and $\hat{\text{PNS}}$ as complementary causal metrics and shows that the change-captioning stage (stage 2) and the verification stage (stage 4) are both necessary for faithful explanations. The work targets interpretability in safety-critical settings and releases a reusable pipeline and codebase for analyzing biases and failures in complex vision systems.
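The two causal metrics named above can be illustrated with a minimal sketch. It assumes the standard definitions: CaCE as the difference in mean classifier output between images edited to contain a concept and the same images edited to remove it, and PNS (probability of necessity and sufficiency) as the probability that the classifier predicts the class when the concept is present and does not predict it when the concept is absent. The function names, the paired-list input format, and the toy predictions are hypothetical and are not taken from the GIFT codebase; the estimators used in the paper may differ in detail.

```python
from typing import Sequence


def _check_paired(a: Sequence[float], b: Sequence[float]) -> None:
    """Both lists must describe the same non-empty set of images."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty prediction lists of equal length")


def cace(preds_with: Sequence[float], preds_without: Sequence[float]) -> float:
    """Assumed CaCE estimate: mean output under do(concept=1)
    minus mean output under do(concept=0)."""
    _check_paired(preds_with, preds_without)
    n = len(preds_with)
    return sum(preds_with) / n - sum(preds_without) / n


def pns_estimate(preds_with: Sequence[int], preds_without: Sequence[int]) -> float:
    """Assumed PNS estimate: fraction of paired images where the class is
    predicted with the concept (1) and not predicted without it (0)."""
    _check_paired(preds_with, preds_without)
    hits = sum(1 for w, wo in zip(preds_with, preds_without) if w == 1 and wo == 0)
    return hits / len(preds_with)


# Hypothetical binary predictions for four images, each edited twice:
# once so the concept is present, once so it is absent.
preds_with = [1, 1, 1, 0]
preds_without = [0, 1, 0, 1]

print(cace(preds_with, preds_without))          # 0.25
print(pns_estimate(preds_with, preds_without))  # 0.5
```

In this toy example the two metrics disagree: the fourth image flips in the opposite direction, which lowers the average effect measured by CaCE but only counts as a non-hit in the paired PNS estimate. This is one way to see why the two metrics can be complementary.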
Abstract
Understanding deep models is crucial for deploying them in safety-critical applications. We introduce GIFT, a framework for deriving post-hoc, global, interpretable, and faithful textual explanations for vision classifiers. GIFT starts from local faithful visual counterfactual explanations and employs (vision) language models to translate those into global textual explanations. Crucially, GIFT provides a verification stage measuring the causal effect of the proposed explanations on the classifier decision. Through experiments across diverse datasets, including CLEVR, CelebA, and BDD, we demonstrate that GIFT effectively reveals meaningful insights, uncovering tasks, concepts, and biases used by deep vision classifiers. The framework is released at https://github.com/valeoai/GIFT.
