Cross-modal Counterfactual Explanations: Uncovering Decision Factors and Dataset Biases in Subjective Classification

Alina Elena Baia, Andrea Cavallaro

TL;DR

This paper introduces DeX, a training-free, cross-modal decompositional framework for generating text-based counterfactual explanations of image privacy decisions. By grounding explanations in image-specific concepts and manipulating multimodal embeddings with cross-modal arithmetic, DeX produces multiple, diverse, and sparsely activated counterfactuals evaluated via a multi-criteria Pareto optimization. The approach not only reveals key decision factors but also uncovers dataset biases, enabling targeted fairness improvements. Empirical results on PrivacyAlert and VISPR demonstrate high validity, proximity, and diversity, with favorable comparisons to existing methods like CounTEX. Overall, DeX offers scalable, interpretable, and image-grounded explanations suitable for high-stakes, subjective classification tasks.
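To make the idea of cross-modal arithmetic concrete, here is a minimal sketch, assuming image and text embeddings live in a shared, unit-normalized space (as in CLIP-style models) and that a concept is added or removed by a weighted sum followed by renormalization. The function `edit_embedding`, the `weight` parameter, and the 3-D toy vectors are all hypothetical and chosen for illustration; the paper's exact formulation may differ.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def edit_embedding(image_emb, concept_emb, weight):
    """Hypothetical edit: weight > 0 adds the concept, weight < 0 removes it."""
    img = normalize(image_emb)
    con = normalize(concept_emb)
    return normalize([i + weight * c for i, c in zip(img, con)])

# Toy embeddings (assumed values, not taken from any real model).
image = [0.6, 0.6, 0.5]   # stands in for an image embedding
person = [1.0, 0.0, 0.0]  # stands in for the text embedding of "person"

removed = edit_embedding(image, person, weight=-0.5)
added = edit_embedding(image, person, weight=0.5)

print("before :", round(cosine(image, person), 3))
print("removed:", round(cosine(removed, person), 3))
print("added  :", round(cosine(added, person), 3))
```

In this toy setting, removing the concept lowers the similarity between the edited embedding and the concept's text embedding, while adding it raises that similarity.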

Abstract

Concept-driven counterfactuals explain decisions of classifiers by altering the model predictions through semantic changes. In this paper, we present a novel approach that leverages cross-modal decompositionality and image-specific concepts to create counterfactual scenarios expressed in natural language. We apply the proposed interpretability framework, termed Decompose and Explain (DeX), to the challenging domain of image privacy decisions, which are contextual and subjective. This application enables the quantification of the differential contributions of key scene elements to the model prediction. We identify relevant decision factors via a multi-criterion selection mechanism that considers both image similarity for minimal perturbations and decision confidence to prioritize impactful changes. This approach evaluates and compares diverse explanations, and assesses the interdependency and mutual influence among explanatory properties. By leveraging image-specific concepts, DeX generates image-grounded, sparse explanations, yielding significant improvements over the state of the art. Importantly, DeX operates as a training-free framework, offering high flexibility. Results show that DeX not only uncovers the principal contributing factors influencing subjective decisions, but also identifies underlying dataset biases, allowing for targeted mitigation strategies to improve fairness.
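The multi-criterion selection over image similarity (proximity) and decision confidence can be illustrated with a small Pareto-front filter. This is a sketch under stated assumptions: both criteria are treated as scores to maximize, and the `Candidate` type, the scenario strings, and the numeric values are hypothetical placeholders rather than results from the paper.

```python
from typing import List, NamedTuple

class Candidate(NamedTuple):
    scenario: str      # text description of the counterfactual edit
    proximity: float   # similarity to the original image (higher = smaller change)
    confidence: float  # model confidence in the flipped decision

def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both criteria and better on at least one."""
    no_worse = a.proximity >= b.proximity and a.confidence >= b.confidence
    better = a.proximity > b.proximity or a.confidence > b.confidence
    return no_worse and better

def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]

# Hypothetical candidates with made-up scores.
candidates = [
    Candidate("remove 'person'", 0.82, 0.91),
    Candidate("remove 'passport'", 0.90, 0.75),
    Candidate("remove 'tree'", 0.88, 0.60),
    Candidate("remove 'person' and 'passport'", 0.70, 0.95),
]

front = pareto_front(candidates)
for c in front:
    print(c.scenario, c.proximity, c.confidence)
```

Here "remove 'tree'" is dropped because "remove 'passport'" scores higher on both criteria. The remaining three scenarios trade proximity against confidence, which is the kind of trade-off a Pareto front exposes.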

Paper Structure

This paper contains 8 sections, 11 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Examples of cross-modal arithmetic. Concept addition (left) selectively increases target similarity while preserving overall similarity to others. Concept removal (middle, right) results in a localized reduction in similarity for the target and semantically related concepts. Key: $\bullet$/$\times$ denote cosine similarity values before/after the arithmetic; separate markers distinguish related from unrelated concepts and highlight the edited concept.
  • Figure 2: DeX generates concept-based explanations for a given image $I$ via a 3-step process: concept extraction and counterfactual scenario creation, cross-modal decomposition for image representation manipulation in the latent space, and multi-criterion selection to identify privacy-relevant scenarios (i.e. explanations).
  • Figure 3: Confidence-proximity Pareto trade-off of counterfactual explanations (top) and the original images (bottom): it shows the interaction between competing criteria and how different concepts influence the model's decision-making.
  • Figure 4: Sample explanations by DeX and CounTEX. Criteria: confidence ($C$) and proximity ($P$), the latter measured via cosine similarity. Note that the explanations by CounTEX are repetitive and not grounded in the image.
  • Figure 5: Comparison of DeX and CounTEX across validity ($V$), feasibility ($F$), sparsity ($S$), explanation collapse ($R$), proximity ($P$, the cosine similarity between the original image embedding and its counterfactual), and confidence ($C$), on PrivacyAlert (left) and VISPR (right). For visualization, $S$ is scaled to [0,1] and inverted (i.e., the higher, the better). For scaling, the maximum value of $S$ is set to 100. For CounTEX, $F$ and $R$ are reported with respect to the top-3 concepts.
  • ...and 4 more figures