Unimodal Distributions for Ordinal Regression
Jaime S. Cardoso, Ricardo Cruz, Tomé Albuquerque
TL;DR
This work tackles ordinal regression by enforcing unimodal output distributions, addressing the limitation that cross-entropy ignores the order of the labels. It analyses the set of unimodal distributions on the probability simplex, proves connectedness properties, and introduces two independent approaches: UnimodalNet, an architecture that guarantees unimodality (hard unimodality), and Wasserstein Regularization, a projection-based loss term that promotes unimodality (soft unimodality) and offers a principled optimization path. Across ten datasets, both approaches demonstrate strong unimodality and competitive ordinal performance, pointing to a favorable trade-off between unimodality and predictive accuracy. The work also contributes open-source tooling for reproducibility.
Abstract
In many real-world prediction tasks, class labels contain information about the relative order between labels that is not captured by commonly used loss functions such as multicategory cross-entropy. Recently, the preference for unimodal distributions in the output space has been incorporated into models and loss functions to account for such ordering information. However, current approaches rely on heuristics that lack a theoretical foundation. Here, we propose two new approaches to incorporate the preference for unimodal distributions into the predictive model. We analyse the set of unimodal distributions in the probability simplex and establish fundamental properties. We then propose a new architecture that imposes unimodal distributions and a new loss term that relies on the notion of projection onto a set to promote unimodality. Experiments show that the new architecture achieves top-2 performance, and that the proposed loss term is very competitive while maintaining high unimodality.
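To make the notion of a unimodal output distribution concrete, the following is a minimal sketch of a unimodality check on a vector of class probabilities. It assumes the common definition (probabilities weakly increase up to a single peak and weakly decrease afterwards); the paper's exact definition may differ in details such as tie handling. The function name `is_unimodal` and the tolerance `tol` are hypothetical and not taken from the authors' code.

```python
from typing import Sequence


def is_unimodal(probs: Sequence[float], tol: float = 1e-9) -> bool:
    """Return True if probs weakly rises to a peak and then weakly falls.

    Assumed definition: once the sequence has started to decrease,
    it must never increase again. Differences within tol count as ties.
    """
    falling = False
    for prev, curr in zip(probs, probs[1:]):
        if curr < prev - tol:
            # A strict decrease marks the start of the descending side.
            falling = True
        elif curr > prev + tol and falling:
            # A strict increase after a decrease means a second mode.
            return False
    return True


# Ordered classes, e.g. severity grades 0..3.
print(is_unimodal([0.1, 0.3, 0.4, 0.2]))  # True: single peak at class 2
print(is_unimodal([0.4, 0.1, 0.3, 0.2]))  # False: peaks at classes 0 and 2
```

A check like this could be used to measure how often a model's predicted distributions are unimodal, which is one way to read the "high unimodality" reported in the abstract.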
