ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations

Rwiddhi Chakraborty; Adrian Sletten; Michael Kampffmeyer

TL;DR

This work introduces ExMap, an unsupervised two-stage mechanism designed to enhance group robustness in traditional classifiers. A clustering module infers pseudo-labels from a model's explainability heatmaps, and these pseudo-labels are then used during training in place of actual labels.

Abstract

Group robustness strategies aim to mitigate learned biases in deep learning models that arise from spurious correlations present in their training datasets. However, most existing methods rely on access to the label distribution of the groups, which is time-consuming and expensive to obtain. As a result, unsupervised group robustness strategies are sought. Based on the insight that a trained model's classification strategies can be inferred accurately from explainability heatmaps, we introduce ExMap, an unsupervised two-stage mechanism designed to enhance group robustness in traditional classifiers. ExMap uses a clustering module to infer pseudo-labels from a model's explainability heatmaps, which are then used during training in lieu of actual labels. Our empirical studies validate the efficacy of ExMap: we demonstrate that it bridges the performance gap with its supervised counterparts and outperforms existing partially supervised and unsupervised methods. Additionally, ExMap can be seamlessly integrated with existing group robustness learning strategies. Finally, we demonstrate its potential in tackling the emerging issue of multiple shortcut mitigation. Code is available at https://github.com/rwchakra/exmap.
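
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the heatmap method or the clustering algorithm, so plain k-means with farthest-first initialisation stands in here, and the per-class clustering, the L2 normalisation, and the names `kmeans`, `pseudo_group_labels`, and `groups_per_class` are all assumptions made for illustration. The heatmaps are assumed to be precomputed arrays, one per sample.

```python
import numpy as np


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means (a stand-in, not the paper's clustering module)."""
    # Farthest-first initialisation keeps the sketch deterministic.
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.stack(centers).astype(float)

    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return assign


def pseudo_group_labels(heatmaps, class_labels, groups_per_class=2):
    """Cluster heatmaps within each class; return one pseudo-group id per sample.

    Assumption: groups are discovered separately inside each class, and
    `groups_per_class` is chosen by the user.
    """
    class_labels = np.asarray(class_labels)
    flat = np.asarray(heatmaps, dtype=float).reshape(len(heatmaps), -1)
    # Normalise so clusters reflect where attribution falls, not its scale.
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)

    groups = np.zeros(len(flat), dtype=int)
    for offset, c in enumerate(np.unique(class_labels)):
        idx = np.where(class_labels == c)[0]
        groups[idx] = offset * groups_per_class + kmeans(flat[idx], groups_per_class)
    return groups
```

In a pipeline like the one the abstract outlines, the returned pseudo-group ids would take the place of annotated group labels when the classifier is retrained with a group robustness strategy.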

Paper Structure (26 sections, 3 equations, 5 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Single shortcut mitigation with group labels
Other Strategies for Shortcut Mitigation
Multi-Shortcut Mitigation
Heatmap-based Explainability
Worst Group Robustness
Leveraging Explainability Heatmaps for Group Robustness -- ExMap
Explainability Heatmaps
Clustering
Experiments
Datasets
Baselines
Setup
Results: Single Shortcut
...and 11 more sections

Figures (5)

Figure 1: To improve the original model's worst-group accuracy, most current approaches rely on supervised group labels (a), which require extensive annotation. Unsupervised approaches have relied on extracting pseudo-labels from the model's feature representations (b), where information can be highly entangled. ExMap instead infers group pseudo-labels from explainability heatmaps (c), leading to improved worst-group performance.
Figure 2: Our Proposed Method: ExMap facilitates group-robustness by extracting explainability heatmaps from the frozen base ERM model for the validation data (A). These heatmaps are then clustered (B) to obtain pseudo-labels for the underlying groups, which are subsequently chosen for the retraining strategy (C).
Figure 3: The datasets used in our work, visualized with respect to the class labels, and the shortcuts $s$. For the complete list of datasets and more details, please refer to the supplementary material.
Figure 4: ERM and ExMap Heatmaps. Left: the input images. Middle: ERM model explanations. Right: improved group robustness using ExMap. Our method helps improve the focus on relevant attributes, in turn improving the pseudo-label estimation for retraining.
Figure 5: ExMap Heatmaps on CelebA. Each entry represents a group. The positive and negative attributions help interpret which features the model considers spurious (blue) and which features are helpful (red).
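
The worst-group accuracy mentioned in the Figure 1 caption is commonly computed as the minimum per-group accuracy over all groups. The sketch below follows that common definition and is not taken from the paper; the function name `worst_group_accuracy` and the toy arrays in the usage note are hypothetical.

```python
import numpy as np


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (worst accuracy, per-group accuracies) under the usual definition."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    correct = y_true == y_pred
    # Accuracy restricted to the samples of each group.
    per_group = {
        int(g): float(correct[groups == g].mean()) for g in np.unique(groups)
    }
    return min(per_group.values()), per_group
```

For instance, with `y_true = [0, 0, 1, 1, 1, 1]`, `y_pred = [0, 0, 1, 0, 1, 1]`, and `groups = [0, 0, 1, 1, 2, 2]`, five of six predictions are correct overall, yet group 1 is only half correct, so the worst-group accuracy is 0.5. This gap between average and worst-group performance is what group robustness methods aim to close.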
