Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation

Louis L. Chen; Bobbie Chern; Eric Eckstrand; Amogh Mahapatra; Johannes O. Royset

TL;DR

Label contamination and adversarial perturbations substantially degrade neural network performance. The authors introduce the Rockafellian Relaxation Method (RRM), an architecture-agnostic loss-reweighting meta-algorithm that down-weights contaminated samples by solving a bilevel problem with a total-variation penalty $d_{TV}(p^N,p)$, linking to optimistic Wasserstein DRO. Empirical results across MNIST, CIFAR-10, sentiment analysis, and toxicity/clinical datasets show that RRM and its adversarial extension A-RRM consistently improve robustness and test accuracy, even when contaminated labels are numerous or when losses are already robust. The method requires no clean validation data and can auto-tune to a contamination estimate $C'$, making it practical for large industrial datasets while maintaining favorable computational efficiency due to a tractable inner LP. Overall, RRM offers a flexible, scalable tool to enhance ERM under label noise and adversarial perturbations with broad applicability across domains and architectures.
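To make the reweighting step concrete, here is a minimal sketch of the inner problem for fixed per-sample losses. It assumes the inner problem has the form $\min_{p \in \Delta} \sum_i p_i c_i + \gamma\, d_{TV}(p^N, p)$ with $d_{TV}(p^N, p) = \tfrac{1}{2}\lVert p - p^N \rVert_1$ and $p^N$ the uniform empirical weights; this form is inferred from the summary above and may differ from the paper's exact scaling. The function names `reweight_tv` and `objective` are hypothetical. Under this assumption, moving a unit of mass from sample $i$ to a lowest-loss sample costs $\gamma$ and saves $c_i - c_{min}$, so only samples with $c_i > c_{min} + \gamma$ are down-weighted.

```python
import numpy as np

def reweight_tv(losses, gamma):
    """Return one minimizer of sum_i p_i*c_i + gamma * 0.5*||p - p^N||_1 over the simplex.

    Assumed form of the inner problem (not confirmed by the source).
    """
    c = np.asarray(losses, dtype=float)
    n = c.size
    p = np.full(n, 1.0 / n)            # start from uniform empirical weights p^N
    target = int(np.argmin(c))         # one lowest-loss sample receives the moved mass
    prune = c > c[target] + gamma      # samples whose loss gap exceeds the penalty
    moved = p[prune].sum()
    p[prune] = 0.0                     # high-loss samples are removed entirely
    p[target] += moved
    return p

def objective(p, losses, gamma):
    """Reweighted loss plus the total-variation penalty (same assumed form)."""
    c = np.asarray(losses, dtype=float)
    tv = 0.5 * np.abs(p - 1.0 / c.size).sum()
    return float(p @ c + gamma * tv)

losses = [0.1, 0.2, 5.0, 0.15]         # illustrative values; the third sample looks mislabeled
weights = reweight_tv(losses, gamma=1.0)
```

In this toy example only the third sample exceeds the threshold $c_{min} + \gamma = 1.1$, so its weight of $1/4$ moves to the first sample. Ties at exactly $c_{min} + \gamma$ are left untouched here, which is one of several equally good choices.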

Abstract

Labeling errors in datasets are common and arise in a variety of contexts, such as human labeling, noisy labeling, and weak labeling (e.g., in image classification). Although neural networks (NNs) can tolerate modest amounts of these errors, their performance degrades substantially once error levels exceed a certain threshold. We propose a new architecture-independent loss-reweighting methodology, the Rockafellian Relaxation Method (RRM), for neural network training. Experiments indicate that RRM can enhance neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RRM can mitigate the effects of dataset contamination stemming from (heavy) labeling error, adversarial perturbation, or both, demonstrating effectiveness across a variety of data domains and machine learning tasks.

Paper Structure (28 sections, 6 theorems, 17 equations, 9 tables, 3 algorithms)

Introduction
Related Work
Methodology
Mislabeling
Rockafellian Relaxation Method (RRM)
Analysis and Interpretation of Rockafellian Relaxation
RRM and Optimistic Wasserstein Distributionally Robust Optimization
Loss-reweighting via Data-Driven Wasserstein Formulation
A-RRM/RRM Algorithm
Auto-Tuning: Precise Sample Pruning with a Contamination Estimate $C'$
A-RRM $(\epsilon > 0)$ versus RRM $(\epsilon = 0)$
On Complexity
Datasets
Architectures
Experiments and Results
...and 13 more sections

Key Result

Theorem 3.1

Let $\gamma > 0$ and $c = (c_1, \dots, c_N) \in \mathbb{R}^N$, with $c_{min} := \min_{i} c_i$ and $c_{max} := \max_i c_i$. Write $I_{min} := \{i: c_i = c_{min}\}$, $I_{mid} := \{i: c_i \in (c_{min}, c_{min} + \gamma)\}$, $I_{big} := \{i: c_i = c_{min} + \gamma\}$, and for any $S_1 \subseteq I_{min}$, ...
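The index sets in the theorem statement can be computed directly from a loss vector. The sketch below is illustrative only: the function name `index_sets` and the floating-point tolerance `tol` are assumptions introduced here, and indices are 0-based rather than the 1-based indexing of the statement.

```python
import numpy as np

def index_sets(c, gamma, tol=1e-12):
    """Split indices into I_min, I_mid, I_big as defined in the theorem statement.

    tol is an assumed tolerance for comparing floating-point losses.
    """
    c = np.asarray(c, dtype=float)
    gap = c - c.min()                                         # c_i - c_min
    i_min = np.flatnonzero(gap <= tol)                        # c_i = c_min
    i_mid = np.flatnonzero((gap > tol) & (gap < gamma - tol)) # c_min < c_i < c_min + gamma
    i_big = np.flatnonzero(np.abs(gap - gamma) <= tol)        # c_i = c_min + gamma
    return i_min, i_mid, i_big

# Illustrative losses: the last entry lies above c_min + gamma and falls in none of the sets.
i_min, i_mid, i_big = index_sets([0.5, 0.5, 1.0, 1.5, 3.0], gamma=1.0)
```

Indices with $c_i > c_{min} + \gamma$ belong to none of the three sets.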

Theorems & Definitions (9)

Theorem 3.1
Corollary 3.1.1
Proposition 3.1
Theorem 1.1
proof
Corollary 1.0.1
proof
Proposition 1.1
proof
