Inaccurate Label Distribution Learning with Dependency Noise
Zhiqiang Kou, Jing Wang, Yuheng Jia, Xin Geng
TL;DR
The paper tackles inaccurate label distributions caused by instance- and label-dependent noise in LDL. It proposes DN-ILDL, which models the noisy distribution as the sum $\boldsymbol{D}+\boldsymbol{E}$ of the true distribution $\boldsymbol{D}$ and a noise matrix $\boldsymbol{E}=\boldsymbol{X}\boldsymbol{P}+\boldsymbol{Y}\boldsymbol{Q}$ built from feature and label representations, and learns a low-rank mapping $\boldsymbol{W}$ from features to true distributions. The method jointly optimizes $\boldsymbol{W}$, $\boldsymbol{P}$, and $\boldsymbol{Q}$ under nuclear-norm and group-sparsity regularization, while a graph regularizer aligns the topologies of the input and output spaces via $\boldsymbol{S}$ and $\tilde{\boldsymbol{S}}=\boldsymbol{\Phi}(\boldsymbol{W},\boldsymbol{X},\boldsymbol{\nabla})$. Optimization is performed with ADMM, and theoretical recovery and generalization bounds are established. On 13 real-world datasets, the method outperforms six LDL baselines and one ILDL method and is robust to parameter choices. The work advances LDL by explicitly modeling dependent noise and integrating topology-preserving constraints, which is relevant wherever annotations are noisy.
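To make the noise decomposition concrete, the following is a minimal NumPy sketch that generates synthetic data according to the model summarized above and evaluates the two regularizers it mentions. Everything in it is an assumption made for illustration: the dimensions, the random data, the choice of a Dirichlet sample as the label representation $\boldsymbol{Y}$, the row-wise grouping used for sparsity, and the helper names `l21_norm` and `nuclear_norm`. It is not the authors' implementation, and the rows of `D` are not normalized to the probability simplex here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c, r = 50, 8, 5, 2  # instances, features, labels, rank of W (illustrative)

X = rng.normal(size=(n, d))                            # feature matrix
W = rng.normal(size=(d, r)) @ rng.normal(size=(r, c))  # low-rank linear mapping
D = X @ W                                              # "true" output of the linear model

# Assumed label representation: one random distribution per instance.
Y = rng.dirichlet(np.ones(c), size=n)

# Group-sparse noise coefficients: only a few rows are nonzero.
P = np.zeros((d, c))
P[[1, 4]] = rng.normal(scale=0.1, size=(2, c))
Q = np.zeros((c, c))
Q[[2]] = rng.normal(scale=0.1, size=(1, c))

E = X @ P + Y @ Q   # noise that depends on instances (X) and labels (Y)
D_noisy = D + E     # observed inaccurate label distribution matrix


def l21_norm(M):
    """Sum of row-wise Euclidean norms (a common group-sparsity penalty)."""
    return np.linalg.norm(M, axis=1).sum()


def nuclear_norm(M):
    """Sum of singular values (a convex surrogate for matrix rank)."""
    return np.linalg.norm(M, ord="nuc")


print("rank(W):", np.linalg.matrix_rank(W))
print("||W||_*:", nuclear_norm(W))
print("||P||_2,1:", l21_norm(P), " ||Q||_2,1:", l21_norm(Q))
```

The nuclear norm favors a low-rank `W`, while the row-wise penalty drives entire rows of `P` and `Q` to zero, so only a few features or labels contribute to the noise.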
Abstract
In this paper, we introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle noise in label distribution learning that depends on both instances and labels. We start by modeling the inaccurate label distribution matrix as a combination of the true label distribution and a noise matrix influenced by specific instances and labels. To address this, we develop a linear mapping from instances to their true label distributions that incorporates label correlations, and we decompose the noise matrix using feature and label representations, applying group sparsity constraints to capture the noise accurately. Furthermore, we employ graph regularization to align the topological structures of the input and output spaces, ensuring accurate reconstruction of the true label distribution matrix. We optimize the model efficiently with the Alternating Direction Method of Multipliers (ADMM), verify that the method can recover true labels accurately, and establish a generalization error bound. Extensive experiments demonstrate that DN-ILDL effectively addresses the ILDL problem and outperforms existing LDL methods.
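ADMM solvers for objectives that combine a nuclear-norm term with a group-sparsity term typically reduce each subproblem to a closed-form proximal step. The sketch below shows the two standard operators involved, singular value thresholding and row-wise shrinkage. It assumes row-wise groups and uses the hypothetical helper names `svt` and `prox_l21`; the abstract does not spell out the update rules, so this illustrates the general technique rather than the authors' exact algorithm.

```python
import numpy as np


def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the singular values
    return (U * s_shrunk) @ Vt


def prox_l21(M, tau):
    """Row-wise shrinkage: proximal operator of tau * L2,1 norm."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    # Rows with norm <= tau are set to zero; the others are scaled down.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * M


# Small usage example on hand-checkable inputs.
low_rank = svt(np.diag([3.0, 1.0]), tau=2.0)
sparse_rows = prox_l21(np.array([[3.0, 4.0], [0.1, 0.0]]), tau=1.0)
print(low_rank)
print(sparse_rows)
```

In the usage example, thresholding removes the smaller singular value entirely, and the row with norm below `tau` is zeroed out, which is how these steps produce low-rank and group-sparse iterates.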
