Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical

Wei Wang; Takashi Ishida; Yu-Jie Zhang; Gang Niu; Masashi Sugiyama

TL;DR

This work tackles learning from complementary labels without relying on the common uniform-distribution or anchor-label assumptions. It introduces SCARCE, an approach based on the Selected-Completely-at-Random (SCAR) assumption that combines an unbiased risk estimator with a risk-correction mechanism to control overfitting, and it shows that, under the one-versus-rest (OVR) strategy, complementary-label learning reduces to a set of negative-unlabeled binary classification problems. Theoretical results establish calibration to the 0-1 loss and an estimation error bound, ensuring consistency as the amount of data grows. Empirically, SCARCE outperforms state-of-the-art methods on diverse synthetic and real-world benchmarks and remains robust to non-uniform label generation and mild misspecification of the class priors. The work also discusses class-prior estimation via Best Bin Estimation, which gives a feasible workflow for practical deployment.

Abstract

Complementary-label learning is a weakly supervised learning problem in which each training example is associated with one or multiple complementary labels indicating the classes to which it does not belong. Existing consistent approaches have relied on the uniform distribution assumption to model the generation of complementary labels, or on an ordinary-label training set to estimate the transition matrix in non-uniform cases. However, either condition may not be satisfied in real-world scenarios. In this paper, we propose a novel consistent approach that does not rely on these conditions. Inspired by the positive-unlabeled (PU) learning literature, we propose an unbiased risk estimator based on the Selected-Completely-at-Random assumption for complementary-label learning. We then introduce a risk-correction approach to address overfitting problems. Furthermore, we find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems when using the one-versus-rest strategy. Extensive experimental results on both synthetic and real-world benchmark datasets validate the superiority of our proposed approach over state-of-the-art methods.
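The negative-unlabeled view mentioned in the abstract can be illustrated with a minimal sketch for a single one-versus-rest task ("class k vs. the rest"). It relies on the standard identity p(x) = π_k p(x | y = k) + (1 − π_k) p(x | y ≠ k), which lets the positive-class term be estimated from unlabeled and negative data alone. The function names, the logistic loss, and the choice to wrap the rewritten term in an absolute value are assumptions made for illustration; the exact estimator and correction used by SCARCE are defined in the paper and are not reproduced here.

```python
import numpy as np

def logistic_loss(z):
    # log(1 + exp(-z)), computed in a numerically stable way
    return np.logaddexp(0.0, -z)

def nu_risk_for_class(scores_neg, scores_unl, prior_k, corrected=True):
    """Illustrative risk estimate for the binary task "class k vs. rest".

    scores_neg: f_k(x) on examples known NOT to belong to class k
                (i.e. examples whose complementary label is k)
    scores_unl: f_k(x) on unlabeled examples drawn from p(x)
    prior_k:    assumed class prior pi_k = p(y = k)
    """
    neg_frac = 1.0 - prior_k
    # Negative part: (1 - pi_k) * E_neg[loss(-f_k(x))]
    neg_term = neg_frac * logistic_loss(-scores_neg).mean()
    # Positive part, rewritten without positive data:
    # pi_k * E_pos[loss(f_k)] = E_unl[loss(f_k)] - (1 - pi_k) * E_neg[loss(f_k)]
    pos_term = (logistic_loss(scores_unl).mean()
                - neg_frac * logistic_loss(scores_neg).mean())
    if corrected:
        # Assumption: keep the rewritten term non-negative via |.|
        pos_term = abs(pos_term)
    return neg_term + pos_term

# A model that has overfit: very confident on both sets
scores_neg = np.array([-10.0, -10.0])
scores_unl = np.array([10.0, 10.0])
raw = nu_risk_for_class(scores_neg, scores_unl, prior_k=0.1, corrected=False)
fixed = nu_risk_for_class(scores_neg, scores_unl, prior_k=0.1, corrected=True)
```

In this toy case the uncorrected estimate is negative even though the true risk cannot be, which is the kind of behavior a risk correction is meant to prevent.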

Paper Structure (45 sections, 17 theorems, 70 equations, 2 figures, 7 tables, 4 algorithms)

Introduction
Preliminaries
Learning with Ordinary Labels
Learning with Complementary Labels
Learning from Positive and Unlabeled Data
Generation Process of Complementary Labels
Methodology
Risk Rewrite
OVR Strategy
Relation to Negative-Unlabeled Learning
Theoretical Analysis
Calibration
Estimation Error Bound
Risk-Correction Approach
Experiments
...and 30 more sections

Key Result

Theorem 2

Under the Selected-Completely-at-Random (SCAR) assumption, the ordinary classification risk can be equivalently expressed as

Figures (2)

Figure 1: Training and test curves of the method that minimizes the unbiased risk estimator (URE), and test curves of our proposed risk-correction approach SCARCE. The green dashed lines indicate when the URE becomes negative, while the yellow dashed lines indicate when overfitting occurs. The complementary labels are generated under the uniform distribution assumption. ResNet is used as the model architecture for CIFAR-10 and an MLP is used for the other datasets.
Figure 2: (a) Classification accuracy of different instantiations of SCARCE on different datasets. (b) Classification accuracy given inaccurate class priors.
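The caption of Figure 1 states that the complementary labels are generated under the uniform distribution assumption. A minimal sketch of what that usually means is given below: each example receives one complementary label drawn uniformly from the classes other than its true class. The function name and the single-label setting are assumptions for illustration; the paper also covers multiple complementary labels per example and non-uniform generation.

```python
import numpy as np

def uniform_complementary_labels(y, num_classes, rng):
    """Draw one complementary label per example, uniform over the wrong classes.

    y:           integer array of true labels in {0, ..., num_classes - 1}
    num_classes: total number of classes K
    rng:         a numpy.random.Generator
    """
    # An offset in {1, ..., K-1} added modulo K never lands on the true
    # label and hits every other class with equal probability.
    offsets = rng.integers(1, num_classes, size=y.shape)
    return (y + offsets) % num_classes

rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=1000)
y_comp = uniform_complementary_labels(y_true, num_classes=10, rng=rng)
```

The true labels are needed only to simulate the weak supervision on a benchmark; a learner trained on `y_comp` never sees `y_true`.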

Theorems & Definitions (32)

Theorem 2
Theorem 3
Lemma 4
Theorem 5
Remark 6
Theorem 7
Remark 8
Theorem 9
Theorem 10
Remark 11
...and 22 more
