Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains
Nathan Stromberg, Rohan Ayyagari, Monica Welfert, Sanmi Koyejo, Richard Nock, Lalitha Sankar
TL;DR
The paper addresses robustness to subpopulation shift under domain label noise in last-layer retraining. It analyzes the limitations of annotation-based data augmentations and introduces Regularized Annotation of Domains (RAD). The analysis shows that downsampling and upweighting achieve identical worst-group accuracy (WGA) in the population setting, and that both degrade as domain label noise increases, approaching the WGA of vanilla empirical risk minimization in high-noise regimes. RAD-UW (RAD with upweighting) is proposed to train robust last-layer classifiers without relying on clean domain labels. Empirically, RAD-UW attains competitive or superior WGA across CMNIST, CelebA, Waterbirds, MultiNLI, and CivilComments; it outperforms annotation-reliant methods even when those methods face as little as 5% domain label noise, and the results highlight the value of strong $\ell_1$ regularization in pseudo-annotation and retraining. The approach has practical implications for fairness and privacy, since it avoids heavy dependence on potentially noisy or private domain annotations.
Abstract
Existing methods for last-layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise, and in high-noise regimes approach the WGA of a model trained with vanilla empirical risk minimization. We introduce Regularized Annotation of Domains (RAD) to train robust last-layer classifiers without the need for explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.
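
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes WGA over (class, domain) groups, derives upweighting weights from domain annotations, and corrupts those annotations with noise. The function names, the binary-domain setup, and the symmetric flip noise model are all illustrative assumptions.

```python
import numpy as np

def worst_group_accuracy(y_true, y_pred, domain):
    """Minimum accuracy over all non-empty (class, domain) groups."""
    y_true, y_pred, domain = map(np.asarray, (y_true, y_pred, domain))
    accs = []
    for c in np.unique(y_true):
        for d in np.unique(domain):
            mask = (y_true == c) & (domain == d)
            if mask.any():
                accs.append(float((y_pred[mask] == y_true[mask]).mean()))
    return min(accs)

def upweight(y, domain):
    """Per-sample weights inversely proportional to group size,
    so every (class, domain) group carries the same total weight."""
    y, domain = np.asarray(y), np.asarray(domain)
    groups = y * (domain.max() + 1) + domain  # one integer id per group
    _, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]

def flip_domains(domain, p, rng):
    """Assumed noise model: flip each binary domain label with probability p."""
    domain = np.asarray(domain)
    flips = rng.random(domain.shape[0]) < p
    return np.where(flips, 1 - domain, domain)

# Toy data: integer class labels, binary domain labels, and some predictions.
y_true = np.array([0, 0, 0, 1, 1, 1])
domain = np.array([0, 0, 1, 0, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 0])

wga = worst_group_accuracy(y_true, y_pred, domain)
clean_weights = upweight(y_true, domain)

# Weights computed from noisy annotations no longer balance the true groups.
rng = np.random.default_rng(0)
noisy_weights = upweight(y_true, flip_domains(domain, 0.05, rng))
```

In this sketch the weights would be passed as per-sample weights to whatever loss is used for last-layer retraining. As the flip probability grows, the noisy groups become less informative about the true ones, which is the failure mode the abstract describes for annotation-reliant methods.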
