Positive Label Is All You Need for Multi-Label Classification

Zhixiang Yuan; Kaixin Zhang; Tao Huang

TL;DR

The paper tackles label noise in multi-label image classification by discarding negative labels and training with positive and unlabeled data via a variational PU loss, avoiding reliance on class priors. It introduces per-category $L_\mathrm{var}^{(c)}$ and Mixup-based $L_\mathrm{reg}^{(c)}$ to form $L_\mathrm{pu-mlc}$, augmented by a dynamic re-balance factor $p_c^\gamma$ and an adaptive per-category temperature $\tau^{(c)}$, plus a Local-Global Convolution module (LgConv) to capture local and global image dependencies without retraining the backbone. The approach yields state-of-the-art results on MS-COCO and PASCAL VOC under varying known-label scenarios, while significantly reducing annotation costs. This work offers a robust and scalable pathway for deploying multi-label models in noisy labeling environments, with practical impact on data labeling efficiency and model reliability.

Abstract

Multi-label classification (MLC) suffers from label noise in training data because each image must be annotated with diverse semantic labels. Current methods mainly aim to identify and correct label mistakes using trained MLC models, but they still struggle with noisy labels that persist during training, resulting in imprecise recognition and reduced performance. Our paper addresses label noise in MLC by introducing a positive and unlabeled multi-label classification (PU-MLC) method. To counteract noisy labels, we directly discard negative labels, motivated by the fact that negative labels are abundant and are the origin of most noisy labels. PU-MLC employs positive-unlabeled learning, training the model with only positive labels and unlabeled data. The method incorporates adaptive re-balance factors and temperature coefficients in the loss function to address label distribution imbalance and prevent over-smoothing of probabilities during training. Additionally, we introduce a local-global convolution module to capture both local and global dependencies in the image without requiring backbone retraining. PU-MLC proves effective on MLC and MLC with partial labels (MLC-PL) tasks, demonstrating significant improvements on the MS-COCO and PASCAL VOC datasets with fewer annotations. Code is available at: https://github.com/TAKELAMAG/PU-MLC.
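To make the idea of training with only positive labels and unlabeled data concrete, here is a minimal sketch of a per-category variational PU loss. It assumes the standard variational PU objective from the PU learning literature, log of the mean predicted probability over unlabeled samples minus the mean log probability over positive samples, which may differ in detail from the paper's $L_\mathrm{var}^{(c)}$. The function name `variational_pu_loss`, the input layout, and the rule that every entry without a known positive label counts as unlabeled are illustrative assumptions. The Mixup regularizer, re-balance factor, and temperature are not included.

```python
import math


def variational_pu_loss(probs, labels, eps=1e-7):
    """Sketch of a per-category variational PU loss (assumed form, not the paper's code).

    probs[i][c]  : predicted probability that image i contains category c
    labels[i][c] : 1 if category c is a known positive for image i, else 0.
                   Assumption: every 0 entry is treated as unlabeled, never as negative.
    Returns one loss value per category.
    """
    num_classes = len(probs[0])
    losses = []
    for c in range(num_classes):
        pos = [row[c] for row, lab in zip(probs, labels) if lab[c] == 1]
        unl = [row[c] for row, lab in zip(probs, labels) if lab[c] == 0]
        if not pos or not unl:
            # The category has no positives or no unlabeled samples in this batch.
            losses.append(0.0)
            continue
        # log E_unlabeled[p]: pushes the average score on unlabeled data down.
        log_mean_unl = math.log(max(sum(unl) / len(unl), eps))
        # E_positive[log p]: pushes scores on known positives up.
        mean_log_pos = sum(math.log(max(p, eps)) for p in pos) / len(pos)
        losses.append(log_mean_unl - mean_log_pos)
    return losses


if __name__ == "__main__":
    batch_probs = [[0.9, 0.2], [0.1, 0.8], [0.2, 0.3]]
    batch_labels = [[1, 0], [0, 1], [0, 0]]
    print(variational_pu_loss(batch_probs, batch_labels))
```

Note that no term in this sketch requires a class prior or a trusted negative label, which is the property the abstract relies on when it discards negatives.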

Paper Structure (13 sections, 11 equations, 3 figures, 3 tables)

Introduction
Related Work
Multi-Label Classification
Positive-Unlabeled (PU) learning
Proposed Approach: PU-MLC
MLC as PU learning
Catastrophic Imbalance of Label Distribution
Adaptive Temperature Coefficient
Local-Global Convolution
Experiments
Results on MS-COCO
Results on Pascal VOC 2007
Conclusion

Figures (3)

Figure 1: Comparison of different learning methods in MLC. (a) A sample image with two missing labels. When training on this image, (b) traditional MLC methods mistakenly treat the missing labels as negative labels; (c) MLC-PL samples a proportion of labels but still encounters false negative labels; (d) our method treats all negative labels as unlabeled ones. Blue, red, and yellow icons denote positive, negative, and unknown labels, respectively.
Figure 2: (a) Overview of our proposed PU-MLC. Instead of using both positive and negative labels as in traditional MLC methods (red box), PU-MLC adopts a positive-unlabeled (PU) learning strategy that leverages only partial positive labels. We also introduce a mixup regularization loss and an adaptive temperature coefficient module to further boost performance. Additionally, we enhance the global representations in the backbone by integrating a local-global convolution module into every $3\times 3$ local convolution. Std: standard deviation. (b) Histograms of the number of positive and negative labels in each category, for 20 categories randomly selected from the MS-COCO train set.
Figure 3: Illustration of Local-global convolution.
