Label Distribution Learning with Biased Annotations by Learning Multi-Label Representation

Zhiqiang Kou, Si Qin, Hailin Wang, Mingkun Xie, Shuo Chen, Yuheng Jia, Tongliang Liu, Masashi Sugiyama, Xin Geng

TL;DR

This work tackles label distribution learning (LDL) under biased annotations by reframing the problem: rather than directly denoising distributions, it first converts soft label distributions into hard multi-hot labels and then recovers the true label information while exploiting a low-rank multi-label space to model label correlations. The proposed BLDL framework jointly learns a low-rank multi-label mapping and recovers the true label distributions through a degradation-aware formulation that links $\mathbf{D}$, $\hat{\mathbf{D}}$, and $\hat{\mathbf{L}}$ via an auxiliary matrix $\mathbf{O}$, with a nuclear-norm relaxation and ADMM optimization. Theoretical analysis confirms convergence of the ADMM scheme and provides a generalization bound, while extensive experiments on 12 real-world datasets show that BLDL outperforms state-of-the-art LDL methods under bias, with ablations highlighting the importance of both bias recovery and low-rank multi-label modeling. This approach offers robust, scalable LDL in the presence of annotator bias and high-rank distributions, making label distribution learning more practical in complex domains.

Abstract

Multi-label learning (MLL) has gained attention for its ability to represent real-world data. Label Distribution Learning (LDL), an extension of MLL to learning from label distributions, faces challenges in collecting accurate label distributions. To address the issue of biased annotations, existing works rely on the low-rank assumption and recover true distributions from biased observations by exploring label correlations. However, recent evidence shows that the label distribution tends to be full-rank, and naively applying a low-rank approximation to biased observations leads to inaccurate recovery and performance degradation. In this paper, we address the problem of LDL with biased annotations from a novel perspective, where we first degrade the soft label distribution into a hard multi-hot label and then recover the true label information for each instance. This idea stems from the insight that assigning hard multi-hot labels is often easier than assigning a soft label distribution, and that hard labels show stronger immunity to noise disturbances, leading to smaller label bias. Moreover, assuming that the multi-label space used for predicting label distributions is low-rank offers a more reasonable approach to capturing label correlations. Theoretical analysis and experiments confirm the effectiveness and robustness of our method on real-world datasets.
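The claim that multi-hot labels are less sensitive to annotation bias than soft distributions can be illustrated with a small sketch. The thresholding rule (a label is kept when its description degree is at least the uniform level $1/c$), the helper name `to_multi_hot`, and the example numbers are all assumptions made for illustration; the source does not state how the degradation is actually performed.

```python
import numpy as np

def to_multi_hot(dist, threshold=None):
    """Binarize each row of a label-distribution matrix.

    Hypothetical rule: keep a label when its degree is at least
    `threshold`, which defaults to the uniform level 1/c.
    """
    dist = np.asarray(dist, dtype=float)
    if threshold is None:
        threshold = 1.0 / dist.shape[1]
    return (dist >= threshold).astype(int)

# Made-up true and biased distributions for one instance over 4 labels.
D_true = np.array([[0.50, 0.30, 0.15, 0.05]])
D_biased = np.array([[0.35, 0.40, 0.20, 0.05]])

L_true = to_multi_hot(D_true)
L_biased = to_multi_hot(D_biased)

# The distributions differ noticeably, the multi-hot labels do not.
l1_gap = np.abs(D_true - D_biased).sum()
hamming_gap = int((L_true != L_biased).sum())
print(L_true, L_biased, l1_gap, hamming_gap)
```

In this toy case the two distributions differ in three of the four entries, yet both map to the same multi-hot vector, which is the kind of gap between $\hat{\mathbf{D}}$ and $\hat{\mathbf{L}}$ the abstract describes.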

Paper Structure

This paper contains 10 sections, 2 theorems, 49 equations, 7 figures, 3 tables.

Key Result

Theorem 4.1

All these iterative solutions $\mathbf{W}, \mathbf{O},\mathbf{D},\mathbf{Z}, \mathbf{\Lambda}$ generated by the above ADMM procedure are bounded and convergent.
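To make the kind of ADMM scheme referred to in the theorem more concrete, the following is a minimal sketch on a simplified toy problem, not the BLDL objective itself. It assumes the problem $\min_{\mathbf{W}} \tfrac{1}{2}\|\mathbf{X}\mathbf{W}-\mathbf{L}\|_F^2 + \lambda\|\mathbf{W}\|_*$ with the split $\mathbf{W}=\mathbf{Z}$; the names `svt` and `admm_nuclear`, the parameters `lam` and `rho`, and the random data are illustrative assumptions, and the variables $\mathbf{O}$ and $\mathbf{D}$ from the paper are omitted.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def admm_nuclear(X, L, lam=1.0, rho=1.0, n_iter=200):
    """ADMM for 0.5*||XW - L||_F^2 + lam*||Z||_*  subject to  W = Z."""
    d, c = X.shape[1], L.shape[1]
    W = np.zeros((d, c))
    Z = np.zeros((d, c))
    Lam = np.zeros((d, c))            # Lagrange multiplier for W = Z
    A = X.T @ X + rho * np.eye(d)     # fixed system matrix of the W-step
    XtL = X.T @ L
    for _ in range(n_iter):
        # W-step: ridge-like least squares with the coupling term.
        W = np.linalg.solve(A, XtL + rho * Z - Lam)
        # Z-step: shrink the singular values of W + Lam / rho.
        Z = svt(W + Lam / rho, lam / rho)
        # Dual ascent on the constraint W = Z.
        Lam = Lam + rho * (W - Z)
    return W, Z, Lam

# Toy data: 50 instances, 10 features, 6 binary labels (all made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
L = (rng.random(size=(50, 6)) > 0.5).astype(float)
W, Z, Lam = admm_nuclear(X, L)
print(np.linalg.norm(W - Z))          # primal residual of the split
```

The printed primal residual $\|\mathbf{W}-\mathbf{Z}\|_F$ is one simple quantity to monitor when checking empirically that the iterates settle down.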

Figures (7)

  • Figure 1: Illustration of biased label distribution learning using examples from the RAF dataset [li2019blended] and the Emotion dataset [peng2015mixed]. Despite discrepancies between the biased label distribution $\hat{\mathbf{D}}$ and the true distribution $\mathbf{D}$, their corresponding multi-label representations ($\hat{\mathbf{L}}$ and $\mathbf{L}$) are much closer.
  • Figure 2: An overview of the proposed BLDL framework.
  • Figure 3: Error reduction ($\delta_1$ and $\delta_2$) during iterations on the Twitter and Flickr datasets.
  • Figure 4: Reconstruction error in recovering true distributions for different methods during the training stage.
  • Figure 5: CD diagrams of the compared methods in terms of each metric. For the tests, CD equals 2.3296 at the 0.05 significance level.
  • ...and 2 more figures
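
For context on the critical difference (CD) quoted in the Figure 5 caption: in the standard rank-based post-hoc procedures that follow a Friedman test, the CD is usually computed as shown here. This is the textbook formula, given under the assumption that the paper follows that procedure; the source does not say which post-hoc test or which values of $k$ and $N$ produce 2.3296.

$$
\mathrm{CD} = q_{\alpha}\sqrt{\frac{k(k+1)}{6N}}
$$

Here $k$ is the number of compared methods, $N$ is the number of datasets, and $q_{\alpha}$ is the critical value of the chosen test at significance level $\alpha$. Two methods are considered significantly different when their average ranks differ by more than CD.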

Theorems & Definitions (3)

  • Theorem 4.1
  • Theorem 4.2
  • proof