Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection

Saachi Jain, Kimia Hamidieh, Kristian Georgiev, Andrew Ilyas, Marzyeh Ghassemi, Aleksander Madry

TL;DR

D3M is a debiasing approach that isolates and removes the specific training examples driving a model's failures on minority groups, making it possible to efficiently train debiased classifiers while removing only a small number of examples.

Abstract

Machine learning models can fail on subgroups that are underrepresented during training. While techniques such as dataset balancing can improve performance on underperforming groups, they require access to training group annotations and can end up removing large portions of the dataset. In this paper, we introduce Data Debiasing with Datamodels (D3M), a debiasing approach which isolates and removes specific training examples that drive the model's failures on minority groups. Our approach enables us to efficiently train debiased classifiers while removing only a small number of examples, and does not require training group annotations or additional hyperparameter tuning.

Paper Structure (35 sections, 3 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Our method (D3M) improves worst-group accuracy by identifying and removing the training samples that most negatively impact it. Specifically, we use TRAK (Park et al., 2023) to identify examples that exacerbate the discrepancy in group performance. We then remove these examples and retrain a model on the remaining data.
  • Figure 2: Worst group accuracy on CelebA-Age as a function of the number of examples $k$ removed from the training set, using various removal methods. In green, D3M removes the $k$ training examples with the most negative alignment scores $A_i$. The green star marks the value of $k$ selected by our heuristic ($A_i < 0$). In blue is the performance of a baseline that removes $k$ random examples from the training set, and in orange is dataset balancing, which removes examples randomly from the majority group. Compared to baselines, D3M efficiently improves worst group accuracy.
  • Figure 3: Randomly sampled examples from the subpopulations with the most negative group alignment scores. We find that many of these examples have labeling errors (e.g., platinum blond instead of gray hair).
  • Figure 4: The average group alignment score of the training examples in each subpopulation of CelebA-Age. Subpopulations such as "old" with "bushy eyebrows" or "young" with "gray hair" have particularly negative scores.
  • Figure 5: For four ImageNet classes, the most extreme (positive or negative) examples according to the top PCA direction of the TRAK matrix. Our method identifies color and co-occurrence biases.
  • ...and 2 more figures
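
The removal rule described in the Figure 2 caption (drop the $k$ training examples with the most negative alignment scores $A_i$, with $k$ defaulting to the number of examples where $A_i < 0$) can be sketched roughly as follows. This is only an illustration under stated assumptions: the scores are taken as already computed, the function name `select_examples_to_remove` is hypothetical, and the toy values are made up for the example rather than taken from the paper.

```python
def select_examples_to_remove(alignment_scores, k=None):
    """Return indices of training examples to drop (hypothetical helper).

    alignment_scores: per-example scores A_i, assumed to be precomputed.
    k: how many examples to remove; if None, fall back to the
       A_i < 0 heuristic mentioned in the Figure 2 caption.
    """
    # Sort indices from most negative to most positive score.
    order = sorted(range(len(alignment_scores)),
                   key=lambda i: alignment_scores[i])
    if k is None:
        # Heuristic: remove every example with a negative score.
        k = sum(1 for a in alignment_scores if a < 0)
    return order[:k]


# Toy scores, invented purely for illustration.
toy_scores = [0.3, -1.2, 0.0, -0.1, 2.0]
removed = select_examples_to_remove(toy_scores)
kept = [i for i in range(len(toy_scores)) if i not in set(removed)]
```

A model would then be retrained on the `kept` indices only; how the scores themselves are obtained is outside the scope of this sketch.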

Theorems & Definitions (1)

  • Remark 1: A note on scalability.