Label Noise Robustness for Domain-Agnostic Fair Corrections via Nearest Neighbors Label Spreading

Nathan Stromberg; Rohan Ayyagari; Sanmi Koyejo; Richard Nock; Lalitha Sankar

TL;DR

This work targets worst-group accuracy under symmetric label noise in the setting where last-layer fairness corrections must work without domain annotations. It introduces a plug-in preprocessing step: kNN label spreading in the latent embedding space to denoise labels, followed by an existing two-stage last-layer correction (RAD or SELF). The approach achieves state-of-the-art worst-group accuracy across several datasets and noise levels while adding minimal computational overhead. Key insights include the importance of embedding separability, the need to adapt the neighbor count to the noise level, and the potential to extend fairness corrections to settings without domain annotations. Overall, the method offers a practical, scalable route to robust subgroup fairness in the presence of label noise.
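The preprocessing step described above can be illustrated with a minimal sketch of majority-vote kNN label spreading on precomputed embeddings. This is not the authors' implementation: the function name `knn_label_spread`, the Euclidean metric, the exclusion of each point from its own neighbor set, and the default `k` and `rounds` values are all assumptions made for illustration.

```python
import numpy as np

def knn_label_spread(embeddings, noisy_labels, k=8, rounds=1):
    """Relabel each point by majority vote over its k nearest neighbors.

    Hypothetical helper: the metric, self-exclusion, and defaults are
    illustrative assumptions, not taken from the paper.
    """
    x = np.asarray(embeddings, dtype=float)
    labels = np.asarray(noisy_labels).copy()
    n_classes = int(labels.max()) + 1

    # Pairwise squared Euclidean distances between all embeddings.
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(d2, np.inf)  # a point never votes for itself

    # Indices of the k nearest neighbors of every point.
    nbrs = np.argsort(d2, axis=1)[:, :k]

    for _ in range(rounds):
        # Count neighbor votes per class, then take the majority label.
        votes = np.zeros((len(labels), n_classes))
        for c in range(n_classes):
            votes[:, c] = (labels[nbrs] == c).sum(axis=1)
        labels = votes.argmax(axis=1)
    return labels
```

The cleaned labels would then be passed to the downstream last-layer correction in place of the noisy ones. The dense distance matrix keeps the sketch short but scales quadratically with the number of points; an approximate nearest-neighbor index would be the natural substitute for large datasets.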

Abstract

Last-layer retraining methods have emerged as an efficient framework for correcting existing base models. Within this framework, several methods have been proposed to correct models for subgroup fairness, both with and without group membership information. Importantly, prior work has demonstrated that many of these methods are susceptible to noisy labels. To this end, we propose a drop-in correction for label noise in last-layer retraining and demonstrate that it achieves state-of-the-art worst-group accuracy across a broad range of symmetric label noise levels and a wide variety of datasets exhibiting spurious correlations. Our proposed approach uses label spreading on a latent nearest-neighbors graph and has minimal computational overhead compared to existing methods.
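To make the two quantities in the abstract concrete, the following sketch shows one common way to inject symmetric label noise and to compute worst-group accuracy. It assumes binary labels and groups defined as (class, domain) pairs, which is the usual convention for spurious-correlation benchmarks but is an assumption here; the function names are hypothetical.

```python
import numpy as np

def flip_symmetric(labels, p, rng):
    """Flip each binary label independently with probability p (assumed noise model)."""
    labels = np.asarray(labels)
    flips = rng.random(labels.shape[0]) < p
    return np.where(flips, 1 - labels, labels)

def worst_group_accuracy(y_true, y_pred, domains):
    """Minimum accuracy over the (class, domain) groups present in the data."""
    y_true, y_pred, domains = map(np.asarray, (y_true, y_pred, domains))
    accs = []
    for y in np.unique(y_true):
        for d in np.unique(domains):
            mask = (y_true == y) & (domains == d)
            if mask.any():  # skip empty groups
                accs.append((y_pred[mask] == y).mean())
    return float(min(accs))
```

Note that domain labels are used here only to evaluate a model; the methods discussed in this paper are meant to operate without them during training.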

Paper Structure (22 sections, 1 theorem, 7 equations, 3 figures, 9 tables, 4 algorithms)

Introduction
Related Work
Problem Setup
Label Noise Model
Basic Last-Layer Model Corrections for WGA
Two-Stage Last-Layer Model Correction
Label Spreading for Robust Worst-Group Accuracy
A Note on Domain Label Spreading
kNN-RAD and kNN-SELF
Experiments
Experimental Details
Empirical Evidence for Label Spreading
Results
Discussion and Limitations
Experimental Details
...and 7 more sections

Key Result

Proposition 1

For $k\ge8$ and symmetric label noise level $p$ where $\mathcal{R}^*$ is the Bayes optimal risk, $\mathcal{R}_k$ is the risk of kNN, $d$ is the data feature dimensions, and $L$ is the Lipschitz constant of the Bayes optimal classifier.

Figures (3)

Figure 1: Accuracy (and 95% confidence intervals over 10 runs) of predicted labels from kNN under 20% symmetric label noise. CelebA and Waterbirds achieve strong performance with a large number of nearest neighbors, but CMNIST struggles as the number of neighbors or rounds grows too large.
Figure 2: t-SNE projection of the 2048-dimensional latent embeddings into a 2-dimensional space for visualization. We see that CelebA and Waterbirds show clear class separation while CMNIST has more hierarchical clustering. This could lead to decreased performance of label spreading.
Figure 3: RAD trained with $\alpha$-loss is able to capture minority points at all noise levels, but an increasing number of noisy majority points are selected as noise increases. This leads to poor downstream fairness.

Theorems & Definitions (1)

Proposition 1: Theorem 2 from Gao, Yang, and Zhou (2018)
