On the Equivalence of Regression and Classification
Jayadeva, Naman Dwivedi, Hari Krishnan, N. M. Anoop Krishnan
TL;DR
This work establishes a formal regression-classification equivalence by showing that an $M$-sample regression task with all points on a hyperplane corresponds to a linearly separable classification task with $2M$ samples, where the regression can be recovered from the classifier. It reframes regression through an equivalent SVC problem and introduces a regressability measure that estimates regression difficulty without fitting a model. The authors further propose learning a linearizing map $\phi(x)$ via a neural network using the $J_4$ loss, so that $z = w^T \phi(x)$, enabling a two-step regression process that avoids extensive hyperparameter tuning. Experimental results on large, challenging datasets show that enforcing linearity in the learned representation improves predictive performance (higher $R^2$) relative to a strong neural baseline, highlighting both theoretical unification and practical benefits for regression tasks.
Abstract
A formal link between regression and classification has been tenuous. Even though the margin maximization term $\|w\|$ is used in support vector regression, it has at best been justified as a regularizer. We show that a regression problem with $M$ samples lying on a hyperplane has a one-to-one equivalence with a linearly separable classification task with $2M$ samples. We show that margin maximization on the equivalent classification task leads to a regression formulation different from the one traditionally used. Using the equivalence, we demonstrate a "regressability" measure that can be used to estimate the difficulty of regressing a dataset without first learning a model for it. We use the equivalence to train neural networks to learn a linearizing map that transforms input variables into a space where a linear regressor is adequate.
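
The $M$-to-$2M$ correspondence can be illustrated with a small numerical sketch. The abstract does not spell out the construction, so the code below assumes one common geometric choice: each sample $(x_i, y_i)$ is shifted up and down by a fixed amount `eps` along the target axis, giving one point of each class. A plain least-squares linear classifier stands in for margin maximization, and the helper names (`regression_to_classification`, `fit_linear_classifier`, `classifier_to_regressor`) are hypothetical. This is not necessarily the authors' formulation.

```python
import numpy as np

def regression_to_classification(X, y, eps=1.0):
    """Turn M regression samples into 2M labelled points in (x, y) space.

    Assumed construction: shift each (x_i, y_i) by +eps and -eps along the
    target axis and label the shifted copies +1 and -1 respectively.
    """
    upper = np.hstack([X, (y + eps)[:, None]])
    lower = np.hstack([X, (y - eps)[:, None]])
    points = np.vstack([upper, lower])
    labels = np.concatenate([np.ones(len(y)), -np.ones(len(y))])
    return points, labels

def fit_linear_classifier(points, labels):
    """Least-squares linear classifier (a stand-in, not a max-margin solver)."""
    A = np.hstack([points, np.ones((len(points), 1))])
    theta, *_ = np.linalg.lstsq(A, labels, rcond=None)
    return theta[:-1], theta[-1]  # normal vector, offset

def classifier_to_regressor(normal, offset):
    """Read the regressor off the decision boundary normal . (x, y) + offset = 0."""
    w = -normal[:-1] / normal[-1]
    b = -offset / normal[-1]
    return w, b

# Toy data: M samples lying exactly on a hyperplane y = w^T x + b.
rng = np.random.default_rng(0)
M, d = 50, 3
w_true = np.array([2.0, -1.0, 0.5])
b_true = 0.7
X = rng.normal(size=(M, d))
y = X @ w_true + b_true

points, labels = regression_to_classification(X, y, eps=1.0)
normal, offset = fit_linear_classifier(points, labels)
w_rec, b_rec = classifier_to_regressor(normal, offset)
```

Under these assumptions the $2M$ points are linearly separable by construction, and the separating hyperplane found by the classifier yields the original regression coefficients. A max-margin solver could replace the least-squares step without changing the rest of the sketch.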
