Building Bridges between Regression, Clustering, and Classification

Lawrence Stewart; Francis Bach; Quentin Berthet

TL;DR

The paper starts from the observation that regression trained with squared loss can give suboptimal results for neural networks, and shows that reframing regression as classification via a learnable target encoder-decoder improves training dynamics and predictions. Targets are mapped to distributions over $k$ classes on the simplex and decoded with a linear head, so the approach blends discrete and continuous representations. Variants include hard and soft binning, pre-trained encoders, and an end-to-end joint objective that balances auto-encoding, KL alignment, and regression loss. Across eight real-world datasets, soft binning generally outperforms hard binning, and end-to-end joint training performs best, with reported gains of up to 25% on average over least squares. The method offers improved predictive accuracy, interpretable decoders, and a flexible framework for interpolating between regression and classification objectives.
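To make the encode-then-decode idea concrete, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes fixed, evenly spaced bin centers, a softmax over negative squared distances as the soft-binning rule, and a linear decoder whose weights equal the bin centers. The function names, the `temperature` parameter, and the loss weights `w_ae`, `w_kl`, and `w_reg` are all hypothetical choices made for illustration. In the paper the encoder and decoder are learnable.

```python
import numpy as np

def soft_bin_encode(y, centers, temperature=0.1):
    """Map scalar targets (n,) to distributions over k classes (n, k).

    Assumption: softmax over negative squared distance to each bin center.
    """
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    logits = -((y - centers.reshape(1, -1)) ** 2) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def hard_bin_encode(y, centers):
    """Map each target to a one-hot vector for its nearest bin center."""
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    idx = np.abs(y - centers.reshape(1, -1)).argmin(axis=1)
    return np.eye(len(centers))[idx]

def linear_decode(probs, weights):
    """Linear head: a point on the simplex is mapped back to a scalar."""
    return probs @ weights

def joint_loss(y, model_probs, centers, weights,
               w_ae=1.0, w_kl=1.0, w_reg=1.0, eps=1e-12):
    """Hypothetical weighted sum of auto-encoding, KL, and regression terms."""
    y = np.asarray(y, dtype=float)
    target_probs = soft_bin_encode(y, centers)
    # Auto-encoding: decoding the encoded target should recover the target.
    ae = np.mean((linear_decode(target_probs, weights) - y) ** 2)
    # KL alignment between encoded targets and the model's class distribution.
    p = np.clip(target_probs, eps, 1.0)
    q = np.clip(model_probs, eps, 1.0)
    kl = np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=1))
    # Regression: the decoded model prediction should match the target.
    reg = np.mean((linear_decode(model_probs, weights) - y) ** 2)
    return w_ae * ae + w_kl * kl + w_reg * reg

centers = np.linspace(0.0, 1.0, 5)      # k = 5 assumed bin centers
y = np.array([0.1, 0.5, 0.93])
soft = soft_bin_encode(y, centers)      # rows lie on the simplex
hard = hard_bin_encode(y, centers)      # rows are one-hot
y_from_soft = linear_decode(soft, centers)
y_from_hard = linear_decode(hard, centers)
loss = joint_loss(y, soft, centers, centers)
```

In this toy setup, hard binning snaps each target to its nearest center, while soft binning spreads mass over neighboring classes so the decoded value can fall between centers. In an actual model, `model_probs` would be the softmax output of a network applied to the features $x$.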

Abstract

Regression, the task of predicting a continuous scalar target y based on some features x, is one of the most fundamental tasks in machine learning and statistics. It has been observed and theoretically analyzed that the classical approach, mean-squared error minimization, can lead to suboptimal results when training neural networks. In this work, we propose a new method to improve the training of these models on regression tasks with continuous scalar targets. Our method is based on casting this task in a different fashion, using a target encoder and a prediction decoder, inspired by approaches in classification and clustering. We showcase the performance of our method on a wide range of real-world datasets.
