Variation-Bounded Loss for Noise-Tolerant Learning

Jialiang Wang, Xiong Zhou, Xianming Liu, Gangfeng Hu, Deming Zhai, Junjun Jiang, Haoliang Li

TL;DR

The paper tackles robustness to noisy labels in supervised learning by introducing the Variation Ratio $v(L)$ as a fundamental property of loss functions and proposing Variation-Bounded Loss (VBL) with finite $v(L)$. The authors develop theoretical results showing that smaller $v(L)$ yields tighter excess-risk bounds under symmetric and certain asymmetric noises, and they establish a practical path from the variation ratio to asymmetric conditions. They formalize and analyze how $v(L)$ relaxes the symmetric condition and enables asymmetry, and they present three concrete variation-bounded losses (VCE, VEL, VSL) with tunable parameters. Empirically, VBL variants, including combinations with Normalized Cross Entropy (NCE), achieve strong performance across CIFAR benchmarks and real-world noisy datasets such as WebVision, ILSVRC12, and Clothing1M, while also providing improved feature representations under label noise. The work offers a compact, effective framework for designing robust losses with broad applicability to noisy-label scenarios.

Abstract

Mitigating the negative impact of noisy labels has been a perennial issue in supervised learning. Robust loss functions have emerged as a prevalent solution to this problem. In this work, we introduce the Variation Ratio as a novel property related to the robustness of loss functions, and propose a new family of robust loss functions, termed Variation-Bounded Loss (VBL), which is characterized by a bounded variation ratio. We provide theoretical analyses of the variation ratio, proving that a smaller variation ratio leads to better robustness. Furthermore, we reveal that the variation ratio provides a feasible method to relax the symmetric condition and offers a more concise path to achieve the asymmetric condition. Based on the variation ratio, we reformulate several commonly used loss functions into a variation-bounded form for practical applications. Experiments on various datasets demonstrate the effectiveness and flexibility of our approach.

Paper Structure

This paper contains 44 sections, 8 theorems, 24 equations, 3 figures, 5 tables.

Key Result

Lemma 1

For a loss function $L({\mathbf{u}}, y) = c \cdot \ell(u_y)$, we have where $c = \frac{1}{\min_u|\nabla \ell (u)|}$ is a normalization constant.
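The normalization constant in Lemma 1 can be illustrated numerically. The sketch below approximates $c = 1/\min_u|\nabla \ell(u)|$ on a grid, assuming $u$ is a predicted probability in $(0, 1]$. The shifted log loss $\ell(u) = -\log(u + a)$ with $a = 0.5$, the helper names, and the "gradient spread" (largest over smallest gradient magnitude) are all illustrative assumptions. They are not taken from the paper, and the spread is not claimed to match the paper's Definition 1.

```python
import math

def grad_magnitudes(grad, lo=1e-6, hi=1.0, n=10_001):
    """Evaluate |grad(u)| on an evenly spaced grid over [lo, hi].

    The interval is an assumption: u is treated as a probability,
    and lo > 0 avoids the singularity of 1/u at u = 0.
    """
    step = (hi - lo) / (n - 1)
    return [abs(grad(lo + i * step)) for i in range(n)]

def normalization_constant(grad, **grid):
    """Grid approximation of c = 1 / min_u |grad ell(u)| (Lemma 1)."""
    return 1.0 / min(grad_magnitudes(grad, **grid))

def gradient_spread(grad, **grid):
    """Largest over smallest gradient magnitude on the grid.

    Hypothetical diagnostic for this sketch only.
    """
    mags = grad_magnitudes(grad, **grid)
    return max(mags) / min(mags)

# Cross entropy: ell(u) = -log(u), so grad ell(u) = -1/u.
def ce_grad(u):
    return -1.0 / u

# Hypothetical shifted log loss: ell(u) = -log(u + A), grad = -1/(u + A).
A = 0.5
def shifted_grad(u):
    return -1.0 / (u + A)

c_ce = normalization_constant(ce_grad)
c_shifted = normalization_constant(shifted_grad)
spread_ce = gradient_spread(ce_grad)
spread_shifted = gradient_spread(shifted_grad)

print(f"cross entropy: c ~ {c_ce:.4f}, spread ~ {spread_ce:.1f}")
print(f"shifted log:   c ~ {c_shifted:.4f}, spread ~ {spread_shifted:.4f}")
```

For cross entropy the smallest gradient magnitude occurs at $u = 1$, so $c \approx 1$, while the spread grows without bound as the lower grid limit approaches zero. For the shifted loss the gradient magnitude stays within $[1/(1+a),\, 1/a]$, so both quantities remain finite.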

Figures (3)

  • Figure 1: Left: Absolute values of gradients, i.e., $|\nabla \ell|$. Right: Test accuracies on CIFAR-10 with 0.8 symmetric noise.
  • Figure 2: Visualizations of learned features on CIFAR-10 with 0.4 symmetric noise by t-SNE.
  • Figure 3: Reliability diagrams of CIFAR-10 with 0.8 symmetric noise.

Theorems & Definitions (16)

  • Definition 1: Variation Ratio
  • Definition 2: Variation-Bounded Loss
  • Definition 3: Symmetric Condition
  • Lemma 1
  • Theorem 1: Excess Risk Bound under Symmetric Noise
  • Theorem 2: Excess Risk Bound under Asymmetric and Instance-Dependent Noise
  • Definition 4: Asymmetric Condition
  • Theorem 3
  • Lemma 2
  • proof
  • ...and 6 more