Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning

Jia-Hao Xiao, Ming-Kun Xie, Heng-Bo Fan, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang

TL;DR

This work tackles semi-supervised multi-label learning by addressing two intertwined challenges: generating high-quality pseudo-labels and selecting effective per-class thresholds. It introduces Dual-Decoupling Learning (D2L), which separates correlative and discriminative feature learning and decouples pseudo-label generation from utilization, and Metric-Adaptive Thresholding (MAT), which finds class-wise thresholds by maximizing a chosen metric on labeled data. Empirical results on VOC, COCO, and NUS demonstrate state-of-the-art gains across labeled-proportion regimes, with ablations confirming substantial contributions from both D2L components and MAT, especially when labeled data is scarce. The approach yields richer pseudo-labels and more reliable thresholds, improving label assignment under diverse true-label distributions and offering practical benefits for scalable SSMLL deployments.

Abstract

Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the high cost of collecting precise multi-label annotations. Unlike in conventional semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance may contain multiple semantics. To solve this problem, the mainstream method developed an effective thresholding strategy to generate accurate pseudo-labels. Unfortunately, that method neglected the quality of model predictions and its potential impact on pseudo-labeling performance. In this paper, we propose a dual-perspective method to generate high-quality pseudo-labels. To improve the quality of model predictions, we perform dual-decoupling to boost the learning of correlative and discriminative features, while refining the generation and utilization of pseudo-labels. To obtain proper class-wise thresholds, we propose a metric-adaptive thresholding strategy that estimates the thresholds maximizing pseudo-label performance for a given metric on labeled data. Experiments on multiple benchmark datasets show that the proposed method achieves state-of-the-art performance and outperforms the compared methods by a significant margin.

Paper Structure (33 sections, 8 equations, 10 figures, 7 tables, 1 algorithm)

Figures (10)

  • Figure 1: An illustration of the proposed learning framework. Blue and yellow colors represent correlative-wise and discriminative-wise components, while solid and dashed lines denote the strong and weak data augmentation streams, respectively. 'Spatially-Weighted Sum' indicates the aggregation of probabilities from patches (see \ref{eq:local_sum}). The pseudo-label generation process is gradient-free.
  • Figure 2: The performance of pseudo-labeling during the training stage on COCO.
  • Figure 3: The analyses of parameters in D2L and MAT: (a-b) The results of various metric functions $\mathcal{M}(\cdot,\cdot)$ used in MAT and different $\beta$ values used in metric $F_\beta$, at $p=\{0.05,0.1,0.15,0.2\}$ on COCO; (c-d) The analyses of two parameters, number of patches $n$ and temperature $\alpha$ in D2L framework, at $p=0.05$ on three datasets. The parameter analyses under other settings will be presented in Appendix E.
  • Figure 4: Visualization of attention maps on COCO. Each patch is cropped from the original image starting from the beginning of a row. The class label attached in front of every original image or cropped patch is activated in the attention map.
  • Figure 5: An illustration of MAT. By feeding instances into the model $f(\cdot)\circ\widehat{h}(\cdot)$, we obtain the predictions. By adjusting $\tau_k$, we can achieve the optimal pseudo-labeling performance $\mathcal{M}(\hat{Y}_k, Y_k)$.
  • ...and 5 more figures
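
The thresholding idea described in the abstract and in the Figure 5 caption (adjust each $\tau_k$ to maximize $\mathcal{M}(\hat{Y}_k, Y_k)$ on labeled data) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `f_beta` and `metric_adaptive_thresholds`, the grid search over a fixed set of candidate thresholds, and the use of precomputed probabilities in place of the model $f(\cdot)\circ\widehat{h}(\cdot)$ are all assumptions made for illustration.

```python
import numpy as np


def f_beta(y_pred, y_true, beta=1.0):
    """F_beta score for two binary vectors; returns 0.0 when undefined."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return 0.0 if denom == 0 else float((1 + beta**2) * tp / denom)


def metric_adaptive_thresholds(probs, labels, metric=f_beta, candidates=None):
    """Pick one threshold per class by maximizing `metric` on labeled data.

    probs:  (n_samples, n_classes) predicted probabilities on labeled data
    labels: (n_samples, n_classes) binary ground-truth labels
    The candidate grid is a hypothetical choice, not taken from the paper.
    """
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)
    n_classes = probs.shape[1]
    thresholds = np.empty(n_classes)
    for k in range(n_classes):
        # Score the pseudo-labels produced by each candidate threshold.
        scores = [
            metric((probs[:, k] >= t).astype(int), labels[:, k])
            for t in candidates
        ]
        # argmax keeps the smallest candidate among ties.
        thresholds[k] = candidates[int(np.argmax(scores))]
    return thresholds


# Toy labeled set: 4 instances, 2 classes.
labeled_probs = np.array([[0.90, 0.62],
                          [0.80, 0.52],
                          [0.32, 0.70],
                          [0.12, 0.20]])
labeled_targets = np.array([[1, 1],
                            [1, 0],
                            [0, 1],
                            [0, 0]])
tau = metric_adaptive_thresholds(labeled_probs, labeled_targets)

# Apply the class-wise thresholds to predictions on unlabeled data.
unlabeled_probs = np.array([[0.50, 0.58],
                            [0.20, 0.40]])
pseudo_labels = (unlabeled_probs >= tau).astype(int)
```

Because the thresholds are chosen per class, a class whose predicted probabilities are systematically low can still receive positive pseudo-labels, which a single global cutoff would suppress. Swapping `metric` for another function of `(y_pred, y_true)` changes the criterion without touching the search loop.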