Engression: Extrapolation through the Lens of Distributional Regression
Xinwei Shen, Nicolai Meinshausen
TL;DR
Engression is a neural-network-based distributional regression framework that models the full conditional distribution of $Y \mid X = x$ through a generative transform, which allows sampling from the fitted distribution and accommodates high-dimensional outcomes. Combined with pre-additive noise models (pre-ANMs), in which noise is added to the covariates before a nonlinear transformation is applied, engression offers a new approach to extrapolation in nonlinear regression, with theory showing distributional extrapolability under mild monotonicity and noise assumptions. Finite-sample analyses establish consistency and error rates outside the training support in well-specified settings, and simulations and real-data experiments show extrapolation advantages over traditional L1/L2 regression and quantile-based methods. The method produces prediction intervals and distributional predictions beyond the training support, making it a practical tool for tasks that require extrapolation and uncertainty quantification in nonlinear settings.
Abstract
Distributional regression aims to estimate the full conditional distribution of a target variable, given covariates. Popular methods include linear and tree-ensemble-based quantile regression. We propose a neural-network-based distributional regression methodology called "engression". An engression model is generative in the sense that we can sample from the fitted conditional distribution, and it is also suitable for high-dimensional outcomes. Furthermore, we find that modelling the conditional distribution on training data can constrain the fitted function outside of the training support, which offers a new perspective on the challenging extrapolation problem in nonlinear regression. In particular, for "pre-additive noise" models, where noise is added to the covariates before applying a nonlinear transformation, we show that engression can successfully perform extrapolation under some assumptions such as monotonicity, whereas traditional regression approaches such as least-squares or quantile regression fall short under the same assumptions. Our empirical results, from both simulated and real data, validate the effectiveness of the engression method and indicate that the pre-additive noise model is suitable for many real-world scenarios. The software implementations of engression are available in both R and Python.
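To make the pre-additive noise idea concrete, the following is a minimal simulation sketch, not the authors' implementation and not the API of the engression packages. It assumes a hypothetical monotone transformation `g(z) = z**3`, standard Gaussian noise, and a training support that ends at `x = 2`; all of these are illustrative choices that do not come from the paper. It compares the upper quantile of $Y \mid X = x$ at the boundary under a pre-additive model, $Y = g(X + \eta)$, and under the usual post-additive model, $Y = g(X) + \eta$.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)


def g(z):
    # Hypothetical monotone nonlinear transformation (an assumption for illustration).
    return z ** 3


def sample_pre_anm(x, n, rng):
    # Pre-additive noise: noise enters before the nonlinearity, Y = g(x + eta).
    eta = rng.normal(size=n)
    return g(x + eta)


def sample_post_anm(x, n, rng):
    # Post-additive noise: noise enters after the nonlinearity, Y = g(x) + eta.
    eta = rng.normal(size=n)
    return g(x) + eta


x_boundary = 2.0   # assumed upper end of the training support
n = 200_000
alpha = 0.9
z_alpha = NormalDist().inv_cdf(alpha)  # alpha-quantile of the standard normal noise

q_pre = np.quantile(sample_pre_anm(x_boundary, n, rng), alpha)
q_post = np.quantile(sample_post_anm(x_boundary, n, rng), alpha)

# For monotone increasing g, the alpha-quantile of g(x + eta) is g(x + z_alpha),
# i.e. the value of g at a point beyond the training boundary.
print("pre-ANM quantile:", q_pre, "vs g(x + z_alpha):", g(x_boundary + z_alpha))
# Under post-additive noise the quantile is g(x) + z_alpha, which only involves g at x.
print("post-ANM quantile:", q_post, "vs g(x) + z_alpha:", g(x_boundary) + z_alpha)
```

In this toy setting, the conditional quantile under the pre-additive model coincides with `g` evaluated outside the assumed support, so the conditional distribution observed at the boundary carries information about `g` beyond it. Under the post-additive model, the same quantile depends on `g` only at the boundary point. This is one way to read the abstract's statement that modelling the conditional distribution on training data can constrain the fitted function outside the training support.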
