Beyond the Norms: Detecting Prediction Errors in Regression Models
Andres Altieri, Marco Romanelli, Georg Pichler, Florence Alberge, Pablo Piantanida
TL;DR
The paper studies how to detect unreliable predictions in regression. A prediction is defined as unreliable when the discrepancy $d(\mathbf{Y}, f_{\mathcal{D}_n}(\mathbf{X}))$ between the target and the regressor's output exceeds a threshold $\epsilon$. The authors introduce data-driven detectors that estimate the discrepancy density and, crucially, a diversity-based score $\mathbb{H}(\mathbf{x})$, with DV-Y and DV-D variants, that separates reliable from unreliable inputs. The approach starts from baseline conditional-distribution methods and adds a data-adaptive mechanism that compensates for estimation error, so it does not require a perfect probabilistic model. It achieves higher AUROC than the baselines on multiple UCI regression tasks. The work also outlines connections to conformal ideas while emphasizing conditional reliability assessment, and it offers practical guidance for uncertainty quantification in safety-critical ML systems.
Abstract
This paper tackles the challenge of detecting unreliable behavior in regression algorithms, which may arise from intrinsic variability (e.g., aleatoric uncertainty) or modeling errors (e.g., model uncertainty). First, we formally introduce the notion of unreliability in regression, i.e., when the discrepancy (or error) between the output of the regressor and the true target exceeds a specified threshold. Then, using powerful tools for probabilistic modeling, we estimate the discrepancy density, and we measure its statistical diversity using our proposed metric for statistical dissimilarity. In turn, this allows us to derive a data-driven score that expresses the uncertainty of the regression outcome. We show empirical improvements in error detection for multiple regression tasks, consistently outperforming popular baseline approaches, and contributing to the broader field of uncertainty quantification and safe machine learning systems. Our code is available at https://zenodo.org/records/11281964.
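To make the unreliability definition and the AUROC evaluation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the absolute error as the discrepancy $d$, a hypothetical regressor `f(x) = 2x`, and a stand-in detector score (the input `x` itself, which tracks the noise level in this toy data). The paper's diversity-based score is not reproduced here; the function names `unreliable_labels` and `auroc` are illustrative only.

```python
import random


def unreliable_labels(y_true, y_pred, eps):
    """Label a prediction 1 (unreliable) when |y - f(x)| > eps, else 0.

    Assumption: the discrepancy d is the absolute error; other choices of d
    would replace abs(y - p) here.
    """
    return [1 if abs(y - p) > eps else 0 for y, p in zip(y_true, y_pred)]


def auroc(scores, labels):
    """AUROC as the probability that an unreliable sample gets a higher
    score than a reliable one, counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy heteroscedastic data: the noise scale grows with x.
    xs = [rng.uniform(0.0, 1.0) for _ in range(500)]
    ys = [2.0 * x + rng.gauss(0.0, 0.05 + x) for x in xs]
    preds = [2.0 * x for x in xs]          # hypothetical regressor f(x) = 2x
    labels = unreliable_labels(ys, preds, eps=0.5)
    scores = xs                             # stand-in score, not the paper's
    print("fraction unreliable:", sum(labels) / len(labels))
    print("AUROC of stand-in score:", auroc(scores, labels))
```

Any detector that maps an input to a real-valued score can be evaluated in the same way: compute the binary unreliability labels from held-out targets for a chosen $\epsilon$, then pass the scores and labels to `auroc`.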
