CroSel: Cross Selection of Confident Pseudo Labels for Partial-Label Learning

Shiyu Tian, Hongxin Wei, Yiqun Wang, Lei Feng

TL;DR

CroSel tackles partial-label learning (PLL) with two components: a cross selection strategy, in which two models use their historical predictions to select confident pseudo labels for each other, and a co-mix consistency regularization, which uses weak and strong augmentations together with MixUp to provide training targets for unselected samples. The method achieves state-of-the-art results on benchmarks such as SVHN, CIFAR-10, and CIFAR-100, with true-label selection ratios often above 90% and improvements that are supported by ablation studies. By combining memory-bank-driven label selection with dynamic supervision and test-time ensembling, CroSel reduces label ambiguity and sustains learning signals throughout training. The approach offers a practical and scalable path for PLL on real-world datasets with noisy candidate labels and limited supervision.

Abstract

Partial-label learning (PLL) is an important weakly supervised learning problem, which allows each training example to have a candidate label set instead of a single ground-truth label. Identification-based methods have been widely explored to tackle label ambiguity issues in PLL, which regard the true label as a latent variable to be identified. However, identifying the true labels accurately and completely remains challenging, causing noise in pseudo labels during model training. In this paper, we propose a new method called CroSel, which leverages historical predictions from the model to identify true labels for most training examples. First, we introduce a cross selection strategy, which enables two deep models to select true labels of partially labeled data for each other. Besides, we propose a novel consistency regularization term called co-mix to avoid sample waste and tiny noise caused by false selection. In this way, CroSel can pick out the true labels of most examples with high precision. Extensive experiments demonstrate the superiority of CroSel, which consistently outperforms previous state-of-the-art methods on benchmark datasets. Additionally, our method achieves over 90% accuracy and quantity for selecting true labels on CIFAR-type datasets under various settings.

Paper Structure (19 sections, 9 equations, 3 figures, 11 tables)


Figures (3)

  • Figure 1: The left side of the figure is a brief example of our memory bank (MB), which stores the softmax output of the model for the last $t$ epochs and is updated by the FIFO (First In First Out) principle; the middle is the cross selection strategy: within each epoch, data subsets $\mathcal{D}_{\mathrm{sel}}$ with confident pseudo-labels are selected from the MB of each network and contribute the loss $\mathcal{L}_{\mathrm{l}}$ to the training of the other network; the right side illustrates our co-mix regularization term and the corresponding loss function $\mathcal{L}_{\mathrm{cr}}$.
  • Figure 2: Parameter analysis of $\lambda_{\mathrm{cr}}$ on CIFAR-10 and CIFAR-100.
  • Figure 3: Selection ratio comparison between the dual model and the single model on CIFAR-10 and CIFAR-100.
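
The memory bank and cross selection described in the Figure 1 caption can be illustrated with a minimal Python sketch. The caption states only that the bank keeps the softmax outputs of the last $t$ epochs in FIFO order and that each network's bank selects confident samples for the other network. Everything else here is an assumption made for illustration: the class name `MemoryBank`, its methods, and in particular the selection criteria (a full history, a stable argmax inside the candidate set, and mean confidence above a hypothetical `threshold`).

```python
from collections import deque


class MemoryBank:
    """FIFO store of the last `t` epochs of softmax outputs, per sample.

    Hypothetical helper for illustration; not the authors' implementation.
    """

    def __init__(self, num_samples, t):
        self.t = t
        # deque(maxlen=t) discards the oldest entry automatically (FIFO).
        self.history = [deque(maxlen=t) for _ in range(num_samples)]

    def update(self, index, probs):
        """Record this epoch's softmax output for one sample."""
        self.history[index].append(list(probs))

    def select(self, candidates, threshold):
        """Return {sample index: pseudo label} for confident samples.

        Assumed criteria (not stated in the caption): the history is full,
        the predicted label is identical across all stored epochs, it lies
        in the sample's candidate set, and its mean confidence exceeds
        `threshold`.
        """
        selected = {}
        for i, hist in enumerate(self.history):
            if len(hist) < self.t:
                continue
            labels = {max(range(len(p)), key=p.__getitem__) for p in hist}
            if len(labels) != 1:
                continue  # prediction changed between epochs
            label = next(iter(labels))
            if label not in candidates[i]:
                continue
            mean_conf = sum(p[label] for p in hist) / len(hist)
            if mean_conf > threshold:
                selected[i] = label
        return selected


# Cross selection: network A's bank picks the labeled subset used to train B.
bank_a = MemoryBank(num_samples=2, t=2)
bank_a.update(0, [0.90, 0.10])
bank_a.update(0, [0.95, 0.05])
bank_a.update(1, [0.60, 0.40])
bank_a.update(1, [0.30, 0.70])
selected_for_b = bank_a.select(candidates=[{0, 1}, {0, 1}], threshold=0.9)
print(selected_for_b)
```

In this toy run only sample 0 is selected: its prediction is stable and confident, while sample 1 flips between labels. In the dual-model setting, a second bank built from network B's outputs would select the subset for network A in the same way.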