Label Distribution Learning with Biased Annotations by Learning Multi-Label Representation
Zhiqiang Kou, Si Qin, Hailin Wang, Mingkun Xie, Shuo Chen, Yuheng Jia, Tongliang Liu, Masashi Sugiyama, Xin Geng
TL;DR
This work tackles label distribution learning (LDL) under biased annotations by reframing the problem: rather than denoising the distributions directly, it first converts soft label distributions into hard multi-hot labels and then recovers the true label information, exploiting a low-rank multi-label space to model label correlations. The proposed BLDL framework jointly learns a low-rank multi-label mapping and recovers the true label distributions through a degradation-aware formulation that links $\mathbf{D}$, $\hat{\mathbf{D}}$, and $\hat{\mathbf{L}}$ via an auxiliary matrix $\mathbf{O}$; the rank constraint is relaxed to a nuclear norm and the resulting problem is solved with ADMM. Theoretical analysis establishes convergence of the ADMM scheme and provides a generalization bound. Experiments on 12 real-world datasets show that BLDL outperforms state-of-the-art LDL methods under bias, and ablations highlight the importance of both bias recovery and low-rank multi-label modeling. The approach offers robust, scalable LDL in the presence of annotator bias and high-rank label distributions, making label distribution learning more practical in complex domains.
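As a rough illustration of the nuclear-norm relaxation mentioned above, the sketch below implements singular value thresholding, the standard proximal operator of the nuclear norm that ADMM solvers for such problems typically call in one of their sub-steps. It is a generic building block, not BLDL's actual update rule: the function name `svt`, the threshold `tau`, and the toy matrix are assumptions made for this example, and the paper's updates involving $\mathbf{D}$, $\hat{\mathbf{D}}$, $\hat{\mathbf{L}}$, and $\mathbf{O}$ are not reproduced here.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: argmin_X tau * ||X||_* + 0.5 * ||X - M||_F^2.

    Hypothetical helper for illustration; not taken from the paper.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold each singular value
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunken spectrum

# Toy example: singular values 3 and 1 shrink to 1 and 0, so the result has rank 1.
M = np.diag([3.0, 1.0])
X = svt(M, tau=2.0)
```

Shrinking the singular values is what pushes the iterate toward low rank; a larger `tau` zeroes out more of the spectrum.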
Abstract
Multi-label learning (MLL) has gained attention for its ability to represent real-world data. Label Distribution Learning (LDL), an extension of MLL to learning from label distributions, is hampered by the difficulty of collecting accurate label distributions. To address biased annotations, existing works rely on a low-rank assumption and recover the true distributions from biased observations by exploiting label correlations. However, recent evidence shows that the label distribution matrix tends to be full-rank, so naively applying a low-rank approximation to biased observations leads to inaccurate recovery and degraded performance. In this paper, we address LDL with biased annotations from a novel perspective: we first degenerate the soft label distribution into a hard multi-hot label and then recover the true label information for each instance. This idea stems from the insight that assigning hard multi-hot labels is often easier than assigning a soft label distribution, and that hard labels are more robust to noise, which leads to smaller label bias. Moreover, assuming that the multi-label space used to predict label distributions is low-rank is a more reasonable way to capture label correlations. Theoretical analysis and experiments on real-world datasets confirm the effectiveness and robustness of our method.
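The intuition that a multi-hot label is less sensitive to annotation noise than a full distribution can be sketched with a toy example. The binarization rule below (mark a label as relevant when its degree reaches the uniform level 1/c, and always keep the top label) is an assumption chosen for illustration, as are the function name `to_multi_hot` and the example matrices; the abstract does not specify the rule the paper uses.

```python
import numpy as np

def to_multi_hot(D, threshold=None):
    """Degenerate an (n x c) label distribution matrix into multi-hot labels.

    Assumed rule, not the paper's: a label is relevant if its description
    degree is at least `threshold` (default: the uniform level 1/c).
    """
    D = np.asarray(D, dtype=float)
    n, c = D.shape
    t = 1.0 / c if threshold is None else threshold
    L = (D >= t).astype(int)
    # Ensure every instance keeps at least its most likely label.
    L[np.arange(n), D.argmax(axis=1)] = 1
    return L

# Hypothetical clean distributions and a biased observation of them.
D_clean = np.array([[0.60, 0.30, 0.10],
                    [0.05, 0.50, 0.45]])
D_biased = np.array([[0.52, 0.30, 0.18],
                     [0.12, 0.46, 0.42]])

L_clean = to_multi_hot(D_clean)
L_biased = to_multi_hot(D_biased)
```

In this example the bias shifts individual description degrees by up to 0.08, yet both matrices map to the same multi-hot labels. This only illustrates the intuition on hand-picked numbers; a perturbation that pushes a degree across the threshold would change the hard label as well.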
