FairDropout: Using Example-Tied Dropout to Enhance Generalization of Minority Groups

Geraldin Nanfack, Eugene Belilovsky

TL;DR

The paper addresses the problem of minority-group generalization under spurious correlations in deep networks trained with empirical risk minimization. It introduces FairDropout, an example-tied dropout that allocates memorizing neurons per example during training and drops them at inference, with tunable probabilities $p_{\text{gen}}$ and $p_{\text{mem}}$ and flexible placement in large architectures. Across vision, language, and medical benchmarks, FairDropout reduces reliance on spurious features and improves worst-group accuracy without requiring training-time group labels, achieving strong gains on datasets like MultiNLI and MIMIC-CXR. The approach highlights a practical, scalable mechanism to mitigate minority-group overfitting by rechanneling memorization away from decision boundaries. Limitations include potential beneficial memorization in some contexts and the need for further study of memorization-generalization interactions.

Abstract

Deep learning models frequently exploit spurious features in training data to achieve low training error, often resulting in poor generalization when faced with shifted testing distributions. To address this issue, various methods from imbalanced learning, representation learning, and classifier recalibration have been proposed to enhance the robustness of deep neural networks against spurious correlations. In this paper, we observe that models trained with empirical risk minimization tend to generalize well for examples from the majority groups while memorizing instances from minority groups. Building on recent findings that show memorization can be localized to a limited number of neurons, we apply example-tied dropout as a method we term FairDropout, aimed at redirecting this memorization to specific neurons that we subsequently drop out during inference. We empirically evaluate FairDropout using the subpopulation benchmark suite encompassing vision, language, and healthcare tasks, demonstrating that it significantly reduces reliance on spurious correlations, and outperforms state-of-the-art methods.

Paper Structure

This paper contains 18 sections, 2 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Discrepancy in generalization behavior between majority groups and minority groups on CelebA. Left: average train and test accuracy. Right: minimum-group train and test accuracy. We observe that trained models exhibit a large generalization gap on minority groups, i.e., minority-group overfitting.
  • Figure 2: For each example in a subset of 100 from the minority group and 100 from other groups, we iteratively remove the most critical neurons from a ResNet-50 model trained on the CelebA dataset, until the example’s prediction flips. (a) We observe that minority-group examples require fewer neurons to flip their prediction. (b) After dropping the most critical neurons from examples in different groups, we report the worst-group accuracy on the training set. We find that the worst-group accuracy is consistently less affected when critical neurons from minority-group examples are dropped, compared to those from other groups. This suggests that minority-group examples are being memorized.
  • Figure 3: Effect on the test worst-group accuracy when dropping memorizing neurons as shown in Figure 2. For each example in the minority-group sample, we drop its most critical neurons (memorizing neurons in this case) and report the measured test worst-group accuracy. From the quartiles in this figure, we observe that in $\approx 75\%$ of cases, dropping out memorizing neurons improves test worst-group accuracy.
  • Figure 4: Training with FairDropout on CelebA. Left: train/test average accuracy is plotted in testing mode. Center and right: train/test worst-group accuracy with FairDropout is plotted. Training mode and testing mode refer to evaluation without dropping the memorizing neurons and after dropping them, respectively. We observe that dropping out these memorizing neurons improves worst-group accuracy.
  • Figure 5: Example-tied dropout as FairDropout. FairDropout redirects example memorization to specific neurons. Memorizing neurons are uniformly allocated to training examples during training. During testing, these memorizing neurons are dropped.
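
The mechanism described in Figure 5 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes that $p_{\text{gen}}$ is the fraction of units in a layer treated as generalizing neurons (always kept), and that $p_{\text{mem}}$ is the probability that each remaining memorizing neuron is assigned to a given training example. The function names `build_example_masks` and `example_tied_dropout` are hypothetical, and any activation rescaling is omitted.

```python
import random

def build_example_masks(num_examples, num_units, p_gen, p_mem, seed=0):
    """Create one fixed mask per training example (assumed layout:
    the first n_gen units are generalizing, the rest are memorizing)."""
    rng = random.Random(seed)
    n_gen = round(p_gen * num_units)
    masks = []
    for _ in range(num_examples):
        # Each memorizing unit is tied to this example with probability p_mem.
        mem_part = [1.0 if rng.random() < p_mem else 0.0
                    for _ in range(num_units - n_gen)]
        masks.append([1.0] * n_gen + mem_part)
    return masks, n_gen

def example_tied_dropout(activations, example_id, masks, n_gen, training):
    """Apply the example's own mask in training; drop all memorizing
    units at inference, keeping only the generalizing ones."""
    if training:
        mask = masks[example_id]
    else:
        mask = [1.0] * n_gen + [0.0] * (len(activations) - n_gen)
    return [a * m for a, m in zip(activations, mask)]

# Illustrative usage with made-up sizes and probabilities.
masks, n_gen = build_example_masks(num_examples=4, num_units=8,
                                   p_gen=0.5, p_mem=0.25)
acts = [1.0] * 8
train_out = example_tied_dropout(acts, 3, masks, n_gen, training=True)
test_out = example_tied_dropout(acts, 3, masks, n_gen, training=False)
```

Because each mask is generated once and indexed by example, the same training example always sees the same memorizing neurons, which is what distinguishes this scheme from standard dropout, where the mask is resampled at every step.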