Discover and Mitigate Multiple Biased Subgroups in Image Classifiers

Zeliang Zhang; Mingqian Feng; Zhiheng Li; Chenliang Xu

TL;DR

This paper addresses the challenge of robustness gaps due to multiple unknown biased subgroups in image classifiers. It proposes DIM, a three-stage framework (Decomposition, Interpretation, Mitigation) that uses supervised Partial Least Squares to decompose latent features into subgroup directions guided by training dynamics, interprets these subgroups via CLIP-based retrieval and LLM summarization, and mitigates biases through data-centric and model-centric strategies including soft-label mitigation. Empirical results on CIFAR-100, Breeds, and Hard ImageNet demonstrate improved discovery and identification of biased subgroups and substantial gains in subgroup-robust accuracy compared with baselines. The approach reveals practical failures and spurious correlations in real-world classifiers, enabling more robust and trustworthy image recognition systems, with broader applicability to understanding model bias through interpretation-driven mitigation.

Abstract

Machine learning models can perform well on in-distribution data but often fail on biased subgroups that are underrepresented in the training data, which hinders their robustness in reliable applications. Such subgroups are typically unknown because subgroup labels are absent. Discovering biased subgroups is the key to understanding models' failure modes and further improving their robustness. Most previous work on subgroup discovery implicitly assumes that models underperform on only a single biased subgroup, which does not hold for in-the-wild data, where multiple biased subgroups exist. In this work, we propose Decomposition, Interpretation, and Mitigation (DIM), a novel method to address a more challenging but also more practical problem: discovering multiple biased subgroups in image classifiers. Our approach decomposes the image features into multiple components that represent multiple subgroups. This decomposition is achieved via a bilinear dimension reduction method, Partial Least Squares (PLS), guided by useful supervision from the image classifier. We further interpret the semantic meaning of each subgroup component by generating natural language descriptions using vision-language foundation models. Finally, DIM mitigates multiple biased subgroups simultaneously via two strategies, one data-centric and one model-centric. Extensive experiments on the CIFAR-100 and Breeds datasets demonstrate the effectiveness of DIM in discovering and mitigating multiple biased subgroups. Furthermore, DIM uncovers the failure modes of the classifier on Hard ImageNet, showcasing its broader applicability to understanding model bias in image classifiers. The code is available at https://github.com/ZhangAIPI/DIM.

Paper Structure (32 sections, 5 equations, 12 figures, 6 tables)

This paper contains 32 sections, 5 equations, 12 figures, 6 tables.

Introduction
Related Work
Problem Formulation
Method
Decomposition
Interpretation
Mitigation
Experiments
Setup
Evaluation on CIFAR-100
Evaluation on Breeds
Interpreting Hard ImageNet
Ablation study
Conclusion
Implementation Details of DIM
...and 17 more sections

Figures (12)

Figure 1: Whereas the previous method [jain2022distilling] uses an SVM to detect a single biased subgroup from the classification correctness on samples, we integrate the training dynamics of biased image classifiers as supervision into the PLS decomposition to discover multiple unknown subgroups. This allows us to interpret the discovered subgroups in finer detail and mitigate their biases precisely.
Figure 2: Overview of the DIM method. DIM consists of three stages: Decomposition, Interpretation, and Mitigation. At the decomposition stage, we decompose the image features of "aquatic mammals" into the embedding directions of multiple subgroups. At the interpretation stage, we describe the discovered subgroup embeddings with text, e.g., the subgroup "whale" in the class "aquatic mammals." At the mitigation stage, we propose data-centric and model-centric strategies that mitigate the subgroup bias and improve the robustness of the image classifier.
Figure 3: CLIP-Retrieval results for the discovered biased subgroup embeddings in the class "large man-made outdoor things." Images in each column come from the same identified subgroup. The method of [jain2022distilling] inherently discovers two subgroups, positive and negative. Although it successfully detects "Skyscraper $\mid$ Road," it fails to detect the low-performance subgroup. In Domino [eyuboglu2021domino], the images retrieved from the first subgroup are a mix of "House" and "Castle." Similarly, images from the third subgroup confuse "Bridge" and "Road."
Figure 4: CLIP-Retrieval results for the discovered biased subgroup embeddings in the class "large natural outdoor scenes." Images in each column come from the same identified subgroup. The method of [jain2022distilling] inherently discovers two subgroups, positive and negative. Although it successfully detects "Plain $\mid$ Cloud," it fails to detect the low-performance subgroup. In Domino [eyuboglu2021domino], the images retrieved from the first subgroup are a mix of "Forest" and "Mountain." Similarly, images from the third subgroup confuse "Plain" and "Cloud."
Figure 5: Example of subgroup interpretation on Hard ImageNet. The first two rows show the images retrieved for the identified biased subgroups, together with the corresponding summary descriptions generated by ChatGPT from metadata. The last row is from the high-performance subgroup.
...and 7 more figures
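
The retrieval results in Figures 3 to 5 pair each discovered subgroup embedding with its nearest images. A minimal sketch of such a lookup is given below, assuming the subgroup direction and the candidate image embeddings already live in the same embedding space and that similarity is measured by cosine similarity. The function name `retrieve_top_k` and the toy 2-dimensional vectors are hypothetical; a real pipeline would use embeddings produced by a CLIP image encoder.

```python
import numpy as np

def retrieve_top_k(direction, image_embeddings, k=3):
    """Return indices and cosine similarities of the k images closest to a direction."""
    # Normalize so that the dot product equals cosine similarity.
    d = direction / np.linalg.norm(direction)
    imgs = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    sims = imgs @ d
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Toy 2-D embeddings (assumption: stand-ins for real image embeddings).
image_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
    [-1.0, 0.0],
])
direction = np.array([2.0, 0.0])  # hypothetical subgroup direction

idx, sims = retrieve_top_k(direction, image_embeddings, k=2)
print(idx, sims)
```

With these toy vectors the first and third images are returned, since they point most nearly along the query direction.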
