No Regularization is Needed: An Efficient and Effective Model for Incomplete Label Distribution Learning
Xiang Li, Songcan Chen
TL;DR
The paper tackles incomplete Label Distribution Learning (InLDL) by using a degree-based prior to weight the loss, with the aim of avoiding explicit regularization. It derives data-dependent upper bounds relating the expected risk to the weighted empirical risk $\hat{\mathcal{R}}_{S}(f)$, where $\mathcal{L}$ is the loss and $\mathbf{P}$ is the weight matrix. These bounds show that weighting introduces implicit regularization through the factor $\| \mathbf{P} \|_\infty$, and they recover classical bounds when $C=1$. The authors propose WInLDL, a linear model $f(\mathbf{X}) = \mathbf{XW}$ with a simplex-constrained output and a weighting matrix $\mathbf{Q}$ informed by the degree prior. The model is solved via ADMM with closed-form subproblem updates and time complexity linear in the number of samples $N$. Empirical results on ten real datasets show that WInLDL is competitive with state-of-the-art methods and scales well, supporting the practical value of incorporating degree priors without explicit regularization.
Abstract
Label Distribution Learning (LDL) assigns soft labels, also known as degrees, to a sample. In practice, obtaining complete degrees is laborious, which gives rise to Incomplete LDL (InLDL). However, InLDL often suffers from performance degradation. To remedy this, existing methods need one or more explicit regularization terms, leading to burdensome parameter tuning and extra computation. We argue that the label distribution itself may provide a useful prior and that, when this prior is used appropriately, the InLDL problem can be solved without any explicit regularization. In this paper, we offer a rational way to use such a prior. Our intuition is that large degrees tend to receive more attention, small ones are easily overlooked, and missing degrees are completely neglected in InLDL. To learn an accurate label distribution, it is crucial not to ignore the small observed degrees but to give them suitably large weights, while gradually increasing the weights of the missing degrees. To this end, we first define a weighted empirical risk and derive upper bounds relating the expected risk to the weighted empirical risk, which reveal in principle that weighting plays an implicit regularization role. Then, using the prior of degrees, we design a weighting scheme and verify its effectiveness. In summary, our model has four advantages: it is 1) model-selection free, as no explicit regularization is imposed; 2) equipped with closed-form solutions to its sub-problems and easy to implement (a few lines of code); 3) linear in computational complexity with respect to the number of samples, and thus scalable to large datasets; and 4) competitive with state-of-the-art methods even without any explicit regularization.
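To make the idea of a weighted empirical risk concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' WInLDL algorithm. The weighting rule in `degree_weights` (observed degree $d$ gets weight $2 - d$, missing entries get a fixed `missing_weight`), the squared loss, and the function names are all hypothetical choices made for illustration. The simplex constraint and the ADMM solver described above are omitted; instead each label column is fit by ordinary weighted least squares, which also has a closed form.

```python
import numpy as np

def degree_weights(D, mask, missing_weight=0.1):
    """Hypothetical weighting rule (an assumption, not the paper's scheme).

    Observed degrees d in [0, 1] get weight 2 - d, so smaller observed
    degrees receive larger weights; missing entries get a small constant.
    """
    return np.where(mask, 2.0 - D, missing_weight)

def weighted_risk(X, W, D, Q):
    """Weighted squared-error empirical risk of the linear model f(X) = XW."""
    residual = X @ W - D
    return 0.5 * np.sum(Q * residual ** 2) / X.shape[0]

def fit_per_label(X, D, Q):
    """Minimize the weighted risk column by column (no explicit regularizer).

    For label j this solves a weighted least-squares problem by rescaling
    rows with sqrt(Q[:, j]). Simplified stand-in for the ADMM solver.
    """
    n_features, n_labels = X.shape[1], D.shape[1]
    W = np.zeros((n_features, n_labels))
    for j in range(n_labels):
        s = np.sqrt(Q[:, j])
        W[:, j], *_ = np.linalg.lstsq(s[:, None] * X, s * D[:, j], rcond=None)
    return W
```

With `mask` marking the observed entries of `D` (missing entries filled with zeros), a call such as `W = fit_per_label(X, D, degree_weights(D, mask))` returns a weight matrix whose predictions `X @ W` would still need to be projected onto the simplex to form valid label distributions.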
