Learning Triangular Distribution in Visual World
Ping Chen, Xingpeng Zhang, Chengtao Zhou, Dichao Fan, Peng Tu, Le Zhang, Yanlin Qian
TL;DR
The paper addresses the mismatch between nonlinear visual features and quasi-continuous labels in label distribution learning by introducing a parameter-free Triangular Distribution Transform (TDT) that creates an injective, linear mapping between feature differences and label differences. TDT uses a symmetric triangular distribution to approximate Gaussian feature differences and learns via a combination of symmetry, commutativity, and supervisory losses so that a linear head can predict labels from transformed features. A prior-sample contrastive-like mechanism guides learning and makes TDT a practical plug-in for standard backbones. Empirical results on facial age estimation, image aesthetics, and illumination estimation show competitive or superior performance compared with state-of-the-art methods, validating TDT’s effectiveness and simplicity for visual regression tasks. The approach offers a lightweight, interpretable pathway to linearize complex feature-label mappings in real-world vision problems.
Abstract
Convolution neural network is successful in pervasive vision tasks, including label distribution learning, which usually takes the form of learning an injection from the non-linear visual features to the well-defined labels. However, how the discrepancy between features is mapped to the label discrepancy is ambient, and its correctness is not guaranteed.To address these problems, we study the mathematical connection between feature and its label, presenting a general and simple framework for label distribution learning. We propose a so-called Triangular Distribution Transform (TDT) to build an injective function between feature and label, guaranteeing that any symmetric feature discrepancy linearly reflects the difference between labels. The proposed TDT can be used as a plug-in in mainstream backbone networks to address different label distribution learning tasks. Experiments on Facial Age Recognition, Illumination Chromaticity Estimation, and Aesthetics assessment show that TDT achieves on-par or better results than the prior arts.
