Distributional regression with reject option
Ahmed Zaoui, Clément Dombry
TL;DR
The paper addresses the challenge of making reliable distributional predictions when uncertainty is high by introducing a reject option for distributional regression with a fixed rejection rate. It derives an optimal rule based on thresholding the entropy of the CRPS, and proposes a semi-supervised plug-in procedure that uses labeled data to estimate the conditional distribution and unlabeled data to calibrate the rejection threshold, achieving distribution-free control of the rejection rate. The authors establish consistency and convergence rates, including non-asymptotic bounds and rate results for distributional KNN, and validate the approach on real datasets showing that better calibration of the entropy threshold improves predictive risk. The work contributes a flexible, practically effective framework for selective distributional prediction with strong theoretical guarantees and empirical support, enabling robust decision-making under uncertainty.
Abstract
Selective prediction, where a model has the option to abstain from making a decision, is crucial for machine learning applications in which mistakes are costly. In this work, we focus on distributional regression and introduce a framework that enables the model to abstain from estimation in situations of high uncertainty. We refer to this approach as distributional regression with reject option, inspired by similar concepts in classification and regression with reject option. We study the scenario where the rejection rate is fixed. We derive a closed-form expression for the optimal rule, which relies on thresholding the entropy function of the Continuous Ranked Probability Score (CRPS). We propose a semi-supervised estimation procedure for the optimal rule, using two datasets: the first, labeled, is used to estimate both the conditional distribution function and the entropy function of the CRPS, while the second, unlabeled, is employed to calibrate the desired rejection rate. Notably, the control of the rejection rate is distribution-free. Under mild conditions, we show that our procedure is asymptotically as effective as the optimal rule, both in terms of error rate and rejection rate. Additionally, we establish rates of convergence for our approach based on distributional k-nearest neighbor. A numerical analysis on real-world datasets demonstrates the strong performance of our procedure.
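The two-sample plug-in procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`knn_entropy`, `calibrate_threshold`), the synthetic data, and the choices of `k` and `reject_rate` are assumptions made for illustration. The sketch uses the standard identity that the CRPS entropy of a distribution F equals half the expected absolute difference between two independent draws from F, applied to the empirical distribution of the k nearest labeled responses. The threshold is taken as an empirical quantile of the entropies on the unlabeled sample, and ties between entropy values are ignored here.

```python
import numpy as np

def knn_entropy(X_train, y_train, X_query, k):
    """CRPS entropy of the k-NN empirical conditional distribution.

    For each query point, the conditional law of Y is estimated by the
    empirical distribution of the k nearest labeled responses, and its
    CRPS entropy is computed as 0.5 * E|Y - Y'| under that distribution.
    """
    # Squared Euclidean distances between query and training points.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest training points for each query point.
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    y_nb = y_train[idx]  # shape (n_query, k)
    # Mean absolute pairwise difference among neighbor responses.
    diffs = np.abs(y_nb[:, :, None] - y_nb[:, None, :])
    return 0.5 * diffs.mean(axis=(1, 2))

def calibrate_threshold(entropy_unlabeled, reject_rate):
    """Empirical (1 - reject_rate)-quantile of the estimated entropies."""
    return np.quantile(entropy_unlabeled, 1.0 - reject_rate)

# Synthetic heteroscedastic data (assumption, for illustration only).
rng = np.random.default_rng(0)
k, reject_rate = 20, 0.2

X_lab = rng.uniform(-1.0, 1.0, size=(500, 2))          # labeled sample
noise_sd = 0.1 + np.abs(X_lab[:, 0])                   # noise grows with |x_0|
y_lab = np.sin(3.0 * X_lab[:, 0]) + noise_sd * rng.normal(size=500)

X_unl = rng.uniform(-1.0, 1.0, size=(1000, 2))         # unlabeled sample

# Step 1: estimate the entropy function from the labeled sample.
ent_unlabeled = knn_entropy(X_lab, y_lab, X_unl, k)
# Step 2: calibrate the threshold on the unlabeled sample.
threshold = calibrate_threshold(ent_unlabeled, reject_rate)
# Step 3: abstain wherever the estimated entropy exceeds the threshold.
rejected = ent_unlabeled > threshold

print(f"threshold = {threshold:.3f}, "
      f"empirical rejection rate = {rejected.mean():.3f}")
```

In this sketch the threshold is applied back to the unlabeled sample only to check the rejection rate; a new point would be handled by computing its k-NN entropy and comparing it with the same calibrated threshold. Because calibration uses only features, the unlabeled sample can be much larger than the labeled one.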
