Improve Representation for Imbalanced Regression through Geometric Constraints
Zijian Dong, Yilei Wu, Chongyao Chen, Yingtian Zou, Yichi Zhang, Juan Helen Zhou
TL;DR
This work tackles the problem of learning uniform representations for deep imbalanced regression (DIR), where labels are continuous and ordered. It introduces two geometry-inspired losses: an enveloping loss, which encourages the latent trace to envelop the unit hypersphere $S^{n-1}$, and a homogeneity loss, which promotes smooth, evenly spaced representations along the trace. Both are integrated via Surrogate-driven Representation Learning (SRL). SRL constructs a surrogate, centroid-based representation across all label bins and pairs it with a contrastive objective, enabling global geometric regularization. The authors further introduce Imbalanced Operator Learning (IOL) as a practical benchmark. Across real-world regression tasks and IOL, SRL improves accuracy, especially in few-shot regions, and yields gains when combined with existing DIR methods, all with manageable computational cost. This geometry-based approach offers a principled path to robust representations under label imbalance and expands DIR evaluation through IOL.
Abstract
In representation learning, uniformity refers to how evenly features are distributed over the latent space (i.e., the unit hypersphere). Previous work has shown that improving uniformity contributes to the learning of under-represented classes. However, most previous work has focused on classification; the representation space of imbalanced regression remains unexplored. Classification-based methods are not suitable for regression tasks because they cluster features into distinct groups without considering the continuous and ordered nature essential for regression. From a geometric perspective, we focus on ensuring uniformity in the latent space for imbalanced regression through two key losses: enveloping and homogeneity. The enveloping loss encourages the induced trace to uniformly occupy the surface of a hypersphere, while the homogeneity loss ensures smoothness, with representations evenly spaced at consistent intervals. Our method integrates these geometric principles into the data representations via a Surrogate-driven Representation Learning (SRL) framework. Experiments with real-world regression and operator learning tasks highlight the importance of uniformity in imbalanced regression and validate the efficacy of our geometry-based loss functions.
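To make the two geometric notions concrete, the NumPy sketch below computes two simple diagnostics on a toy trace on the circle $S^{1}$. It is an illustration only, not the paper's loss definitions, which the abstract does not state. The function names (`homogeneity_penalty`, `enveloping_coverage`), the cosine threshold, and the probe count are all assumptions made for this example. Homogeneity is approximated by the variance of the gaps between consecutive label-ordered points, and enveloping by the fraction of random points on the sphere that lie close to some point of the trace.

```python
import numpy as np


def normalize(x):
    """Project each row onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def homogeneity_penalty(trace):
    """Variance of distances between consecutive points of the trace.

    Hypothetical proxy: it is zero when label-ordered points are evenly spaced.
    """
    gaps = np.linalg.norm(np.diff(trace, axis=0), axis=1)
    return float(np.var(gaps))


def enveloping_coverage(trace, n_probe=2000, threshold=0.9, seed=0):
    """Fraction of random sphere points with cosine similarity >= threshold
    to at least one point of the trace (hypothetical coverage proxy)."""
    rng = np.random.default_rng(seed)
    probes = normalize(rng.standard_normal((n_probe, trace.shape[1])))
    nearest = (probes @ trace.T).max(axis=1)
    return float((nearest >= threshold).mean())


def circle_points(angles):
    """Map angles (radians) to points on the unit circle."""
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)


t = np.linspace(0.0, 1.0, 50)
# Evenly spaced trace that wraps the whole circle.
even = circle_points(np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False))
# Unevenly spaced trace confined to a short arc.
lopsided = circle_points((np.pi / 4.0) * t**2)

print("homogeneity:", homogeneity_penalty(even), homogeneity_penalty(lopsided))
print("coverage:   ", enveloping_coverage(even), enveloping_coverage(lopsided))
```

In this toy setting the evenly spaced, full-circle trace has a near-zero homogeneity penalty and full coverage, while the trace bunched into a short arc scores worse on both. A trainable version would need differentiable surrogates for these quantities.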
