Forming Auxiliary High-confident Instance-level Loss to Promote Learning from Label Proportions

Tianhao Ma, Han Chen, Juncheng Hu, Yungang Zhu, Ximing Li

TL;DR

This work addresses a common failure mode in learning from label proportions (LLP): pseudo-labels become inaccurate and over-smoothed as bag size grows, which degrades the classifier. It introduces L$^2$P-AHIL, which couples a bag-level loss with a high-confident instance-level loss weighted by a dual entropy-based weight (DEW) that combines bag- and instance-level entropies to gauge pseudo-label confidence. The authors report that the method surpasses existing baselines on benchmark datasets, with larger gains as bag size increases, and that adaptive weighting yields more discriminative representations. The approach gives LLP an entropy-driven mechanism for using pseudo-labels selectively, which is useful where instance-level labels are costly or unavailable.

Abstract

Learning from label proportions (LLP), a challenging weakly-supervised learning task, aims to train a classifier using bags of instances and the class proportions within each bag, rather than annotated labels for each instance. Beyond the traditional bag-level loss, the mainstream methodology of LLP is to incorporate an auxiliary instance-level loss with pseudo-labels formed from predictions. Unfortunately, we empirically observed that the pseudo-labels are often inaccurate due to over-smoothing, especially with large bag sizes, which hurts classifier induction. To alleviate this problem, we suggest a novel LLP method, namely Learning from Label Proportions with Auxiliary High-confident Instance-level Loss (L$^2$P-AHIL). Specifically, we propose a dual entropy-based weight (DEW) method to adaptively measure the confidence of pseudo-labels. It simultaneously emphasizes accurate predictions at the bag level and avoids overly smoothed predictions. We then form a high-confident instance-level loss with DEW and jointly optimize it with the bag-level loss in a self-training manner. Experimental results on benchmark datasets show that L$^2$P-AHIL can surpass existing baseline methods, and that the performance gain becomes more significant as the bag size increases. The implementation of our method is available at https://github.com/TianhaoMa5/LLP-AHIL.
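The two loss terms named in the abstract can be illustrated with a minimal sketch. It assumes one common form of the bag-level loss (cross-entropy between the given class proportions and the bag's mean predicted distribution) and a per-instance weighted cross-entropy against pseudo-labels. The function names and the `weights` argument are hypothetical; the weights only stand in for DEW confidences, whose exact formula is not given in this excerpt.

```python
import math


def bag_level_loss(bag_probs, proportions, eps=1e-12):
    """Cross-entropy between the given class proportions and the mean
    predicted distribution of one bag (an assumed, commonly used form)."""
    n = len(bag_probs)
    k = len(proportions)
    # Average the per-instance class probabilities over the bag.
    mean_pred = [sum(p[c] for p in bag_probs) / n for c in range(k)]
    return -sum(proportions[c] * math.log(mean_pred[c] + eps) for c in range(k))


def weighted_instance_loss(bag_probs, pseudo_labels, weights, eps=1e-12):
    """Weighted cross-entropy against pseudo-labels, averaged over the bag.
    `weights` is a placeholder for per-instance confidences (hypothetical)."""
    total = 0.0
    for p, y, w in zip(bag_probs, pseudo_labels, weights):
        ce = -sum(y[c] * math.log(p[c] + eps) for c in range(len(p)))
        total += w * ce
    return total / len(bag_probs)


# Toy bag of two instances and two classes.
probs = [[0.9, 0.1], [0.2, 0.8]]
pseudo = [[1.0, 0.0], [0.0, 1.0]]
joint = bag_level_loss(probs, [0.5, 0.5]) + weighted_instance_loss(probs, pseudo, [1.0, 0.3])
```

A weight of zero removes an instance from the instance-level term entirely, which is how a confidence weight can suppress pseudo-labels that are judged unreliable.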

Paper Structure

This paper contains 26 sections, 10 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Before training, instances are grouped into bags and only each bag's class proportions are retained. The classifier is trained on unlabeled instances and bag-level proportions, and is then used to classify individual instances.
  • Figure 2: Results of preliminary experiments with different bag sizes on CIFAR-10 and CIFAR-100 (500 epochs). (a)(c) Averaged accuracy of pseudo-labels on CIFAR-10 and CIFAR-100. (b)(d) Averaged normalized entropy of pseudo-labels on CIFAR-10 and CIFAR-100. Higher normalized entropy implies smoother pseudo-labels. The ablation table (tab:ablation) shows the adverse effects caused by these inaccurate pseudo-labels.
  • Figure 3: (a) The model pipeline with the "Weak and Strong Augmentation" strategy (aug. denotes augmentation). When calculating the instance-level loss $\mathcal{L}_\mathrm{i}$, $\mathbf{\hat{y}}^\mathrm{s}$ is used as the input and $\mathbf{\tilde{y}}$ as the target. Adaptive weight components are omitted for visual clarity. (b) The detailed process of DEW. The process involves: 1) acquiring predicted and reference distributions at both bag and instance levels (red and green pathways, respectively), 2) calculating entropy for these distributions, and 3) determining adaptive weights through a mapping function that integrates information from both levels. $\text{L1-}\mathcal{N}$ indicates L1-normalization operations.
  • Figure 4: Evolution of adaptive weights across training epochs for bag sizes 16, 64 and 256 on CIFAR-10 and CIFAR-100.
  • Figure 5: t-SNE visualization of learned feature spaces on CIFAR-10 with bag size 256. Distinct colors correspond to different classes, where (a) DLLP and (b) ROT baselines exhibit overlapping clusters, while (c) our method L$^2$P-AHIL demonstrates clearer inter-class separation and tighter intra-class clustering, indicating more discriminative feature learning.
  • ...and 1 more figure
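The normalized entropy plotted in Figure 2 can be illustrated with a short sketch. It assumes the standard definition, Shannon entropy divided by its maximum $\log K$ for $K$ classes, so values lie in $[0, 1]$; the paper's exact formula is not shown in this excerpt, and the function names are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a class distribution divided by log(K).
    Near 0 for a one-hot distribution, near 1 for a uniform one."""
    k = len(probs)
    if k < 2:
        return 0.0
    # Zero-probability classes contribute nothing to the entropy.
    h = sum(-p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def mean_normalized_entropy(pseudo_labels):
    """Average normalized entropy over a collection of pseudo-labels."""
    return sum(normalized_entropy(p) for p in pseudo_labels) / len(pseudo_labels)


sharp = [1.0, 0.0, 0.0, 0.0]        # confident pseudo-label
smooth = [0.25, 0.25, 0.25, 0.25]   # fully over-smoothed pseudo-label
avg = mean_normalized_entropy([sharp, smooth])
```

Under this definition, an over-smoothed pseudo-label scores close to 1 regardless of the number of classes, which makes values comparable between CIFAR-10 and CIFAR-100.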