BadLabel: A Robust Perspective on Evaluating and Enhancing Label-noise Learning

Jingfeng Zhang, Bo Song, Haohan Wang, Bo Han, Tongliang Liu, Lei Liu, Masashi Sugiyama

TL;DR

BadLabel introduces a challenging label-noise type that flips a controlled fraction of labels to maximize loss, subject to a formal constraint on label flips. The authors show that existing LNL algorithms are vulnerable to BadLabel and present Robust DivideMix, a three-stage framework that combines adversarial label perturbations, BayesGMM-based data division, and MixMatch semi-supervised learning (SSL) to recover performance. Experiments on CIFAR-10/100 and MNIST, as well as on the real-world datasets CIFAR-10N and Clothing1M, show that Robust DivideMix achieves superior robustness to BadLabel while remaining competitive on conventional noise types. Overall, the work provides a practical stress test for LNL methods and a general methodology for robust learning under challenging label noise that is not aligned with class boundaries.
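
The data-division stage can be illustrated with a small sketch. It assumes that, after the label perturbation stage, clean labels sit in the low-loss mode, and it swaps in a plain two-component Gaussian mixture fitted by EM in place of the BayesGMM named above. The function names, the synthetic losses, and the 0.5 threshold are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(values, iters=100):
    """Fit a 1-D two-component Gaussian mixture with EM (stand-in for BayesGMM)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                       # start one component at each extreme
    spread = max((hi - lo) ** 2 / 4, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances

def split_by_loss(losses, threshold=0.5):
    """Assign samples to the 'clean' set when the low-loss component is more likely."""
    weights, means, variances = fit_two_gaussians(losses)
    low = 0 if means[0] <= means[1] else 1
    clean_idx, noisy_idx = [], []
    for i, x in enumerate(losses):
        dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
        p_clean = dens[low] / (sum(dens) or 1e-300)
        (clean_idx if p_clean > threshold else noisy_idx).append(i)
    return clean_idx, noisy_idx

# Synthetic per-sample losses (assumption): 600 low-loss and 400 high-loss samples.
rng = random.Random(0)
losses = [rng.gauss(0.3, 0.1) for _ in range(600)] + [rng.gauss(2.0, 0.4) for _ in range(400)]
clean_idx, noisy_idx = split_by_loss(losses)
print(len(clean_idx), len(noisy_idx))
```

In a semi-supervised pipeline of this kind, the samples in `clean_idx` would keep their labels and the rest would be treated as unlabeled data.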

Abstract

Label-noise learning (LNL) aims to improve the model's generalization given training data with noisy labels. To facilitate practical LNL algorithms, researchers have proposed different label noise types, ranging from class-conditional to instance-dependent noise. In this paper, we introduce a novel label noise type called BadLabel, which can significantly degrade the performance of existing LNL algorithms. BadLabel is crafted based on the label-flipping attack against standard classification, where specific samples are selected and their labels are flipped to other labels so that the loss values of clean and noisy labels become indistinguishable. To address the challenge posed by BadLabel, we further propose a robust LNL method that perturbs the labels in an adversarial manner at each epoch to make the loss values of clean and noisy labels distinguishable again. Once we select a small set of (mostly) clean labeled data, we can apply semi-supervised learning techniques to train the model accurately. Empirically, our experimental results demonstrate that existing LNL algorithms are vulnerable to the newly introduced BadLabel noise type, while our proposed robust LNL method can effectively improve the generalization performance of the model under various types of label noise. The new dataset of noisy labels and the source code of the robust LNL algorithms are available at https://github.com/zjfheart/BadLabels.
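
To make the idea of perturbing labels "in an adversarial manner" concrete, the sketch below takes one gradient-ascent step on the label vector with respect to the cross-entropy loss. The abstract does not state the exact update rule, so this form, the step size, and the function names are assumptions for illustration only. For cross-entropy $\ell(p, y) = -\sum_k y_k \log p_k$, the gradient with respect to $y_k$ is $-\log p_k$.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label_vec):
    """Cross-entropy between predicted probabilities and a (soft) label vector."""
    return -sum(y * math.log(p) for y, p in zip(label_vec, probs))

def perturb_label(probs, label_vec, step_size):
    """One gradient-ascent step on the label (assumed rule, not taken from the paper).

    The gradient of the cross-entropy with respect to y_k is -log p_k.
    The result is left unnormalized.
    """
    grad = [-math.log(p) for p in probs]
    return [y + step_size * g for y, g in zip(label_vec, grad)]

# Hypothetical three-class example.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
one_hot = [1.0, 0.0, 0.0]
perturbed = perturb_label(probs, one_hot, step_size=0.1)

loss_before = cross_entropy(probs, one_hot)
loss_after = cross_entropy(probs, perturbed)
print(loss_before, loss_after)
```

Under this assumed rule, the loss on the perturbed label exceeds the original loss by `step_size` times the sum of squared log-probabilities, so the step can never decrease the loss.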

Paper Structure

This paper contains 25 sections, 10 equations, 10 figures, 11 tables, 1 algorithm.

Figures (10)

  • Figure 1: Comparison of different types of label noise: (a) Clean labels, representing a noise-free dataset. (b) Symmetric noise, where the label noise is distributed randomly in each class. (c) Instance-dependent noise, where the label noise is concentrated near the class boundaries. (d) BadLabel, where the label noise is far from the class boundaries. Top row: Synthetic three-class examples. Middle row: Empirical transition matrices of different types of label noise on the CIFAR-10 dataset. Bottom row: Loss distributions of clean and noisy labels of the CIFAR-10 dataset, given a properly trained model.
  • Figure 2: On CIFAR-10 with $40\%$ BadLabel, we visualize the loss distribution of labels before and after label perturbations. After a few epochs of warm-up training, (a) before adversarial perturbation of labels, noisy labels tend to have lower loss values; (b) after adversarial perturbation of labels, the noisy labels have larger loss values. In BadLabel, noisy labels are more sensitive to adversarial perturbations than clean labels. Note that we used hard labels to calculate the loss values despite the label perturbations.
  • Figure 3: Learning curves of several LNL algorithms on CIFAR-10 under varying BadLabel noise ratios. The shaded area represents the error bar corresponding to the standard deviation of Robust DivideMix. Note that, to facilitate a fair comparison of the learning curves, we normalized the learning steps by using uniform sampling, taking into account that different LNL algorithms have different optimal learning schedules.
  • Figure 4: Learning curves of several LNL algorithms on CIFAR-100 under varying BadLabel noise ratios.
  • Figure 5: Learning curves of multiple LNL algorithms on CIFAR-10 with different noise types.
  • ...and 5 more figures
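
Figure 1 refers to symmetric noise and to empirical transition matrices. The sketch below illustrates both on a toy three-class label set. It assumes that symmetric noise flips a chosen fraction of labels to a different class picked uniformly at random (some definitions also allow a flip back to the original class), and the function names and toy data are hypothetical.

```python
import random

def add_symmetric_noise(labels, num_classes, noise_ratio, seed=0):
    """Flip a `noise_ratio` fraction of labels to a different, uniformly chosen class."""
    rng = random.Random(seed)
    noisy = list(labels)
    num_flips = round(noise_ratio * len(labels))
    for i in rng.sample(range(len(labels)), num_flips):
        others = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(others)
    return noisy

def empirical_transition_matrix(clean, noisy, num_classes):
    """T[i][j] is the fraction of samples with clean label i whose noisy label is j."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for c, n in zip(clean, noisy):
        counts[c][n] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([v / total if total else 0.0 for v in row])
    return matrix

# Toy data (assumption): 3000 samples spread evenly over three classes.
clean_labels = [i % 3 for i in range(3000)]
noisy_labels = add_symmetric_noise(clean_labels, num_classes=3, noise_ratio=0.4)
T = empirical_transition_matrix(clean_labels, noisy_labels, num_classes=3)
for row in T:
    print([round(v, 2) for v in row])
```

With a 40% noise ratio, each diagonal entry should come out near 0.6 and the off-diagonal mass should be spread roughly evenly. That is the pattern expected for symmetric noise, in contrast to the more structured matrices the figure shows for other noise types.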