LOCUS: A Distribution-Free Loss-Quantile Score for Risk-Aware Predictions

Matheus Barreto, Mário de Castro, Thiago R. Ramos, Denis Valle, Rafael Izbicki

TL;DR

Locus is a distribution-free wrapper that produces a per-input, loss-scale reliability score for a fixed prediction function. Across 13 regression benchmarks, it ranks inputs by risk effectively and reduces large-loss frequency compared to standard heuristics.

Abstract

Modern machine learning models can be accurate on average yet still make mistakes that dominate deployment cost. We introduce Locus, a distribution-free wrapper that produces a per-input loss-scale reliability score for a fixed prediction function. Rather than quantifying uncertainty about the label, Locus models the realized loss of the prediction function using any engine that outputs a predictive distribution for the loss given an input. A simple split-calibration step turns this function into a distribution-free interpretable score that is comparable across inputs and can be read as an upper loss level. The score is useful on its own for ranking, and it can optionally be thresholded to obtain a transparent flagging rule with distribution-free control of large-loss events. Experiments across 13 regression benchmarks show that Locus yields effective risk ranking and reduces large-loss frequency compared to standard heuristics.
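The pipeline described in the abstract (a loss engine, followed by split calibration, followed by a score in loss units) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy engine (an Exponential predictive law for the loss with a per-bin mean), the synthetic data, and every function name are hypothetical, and the calibration step assumes the score is obtained by taking a conformal quantile of the PIT values $\hat F(Z_i \mid X_i)$ and mapping it back through the engine's quantile function.

```python
import math
import random

N_BINS = 10  # hypothetical resolution of the toy loss engine


def bin_index(x):
    """Map x in [0, 1) to one of N_BINS equal-width bins."""
    return min(int(x * N_BINS), N_BINS - 1)


def fit_loss_engine(xs, zs):
    """Toy engine (assumption): Exponential law for the loss with a per-bin mean."""
    sums, counts = [0.0] * N_BINS, [0] * N_BINS
    for x, z in zip(xs, zs):
        b = bin_index(x)
        sums[b] += z
        counts[b] += 1
    pooled = sum(zs) / len(zs)  # fallback mean for empty bins
    return [s / c if c else pooled for s, c in zip(sums, counts)]


def loss_cdf(means, x, z):
    """Predictive CDF F(z | x) of the loss under the toy engine."""
    return 1.0 - math.exp(-z / means[bin_index(x)])


def calibrate(means, xs_cal, zs_cal, alpha):
    """Split calibration: conformal (1 - alpha) quantile of the PIT scores."""
    pits = sorted(loss_cdf(means, x, z) for x, z in zip(xs_cal, zs_cal))
    k = math.ceil((len(pits) + 1) * (1 - alpha))
    return pits[k - 1] if k <= len(pits) else 1.0


def locus_score(means, q_hat, x):
    """U_alpha(x): the engine's q_hat-quantile of the loss at x, in loss units."""
    if q_hat >= 1.0:
        return math.inf  # too few calibration points for this alpha
    return -means[bin_index(x)] * math.log(1.0 - q_hat)


def simulate(n, rng):
    """Synthetic data: g(x) = x is a fixed, misspecified predictor of y."""
    xs = [rng.random() for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1 + 0.4 * x) for x in xs]
    zs = [abs(x - y) for x, y in zip(xs, ys)]  # realized absolute loss
    return xs, zs


def demo(alpha=0.1, seed=0):
    """Return the empirical frequency of Z <= U_alpha(X) on fresh test data."""
    rng = random.Random(seed)
    x_tr, z_tr = simulate(2000, rng)    # fits the loss engine
    x_cal, z_cal = simulate(2000, rng)  # held out for calibration
    x_te, z_te = simulate(5000, rng)    # evaluates coverage
    means = fit_loss_engine(x_tr, z_tr)
    q_hat = calibrate(means, x_cal, z_cal, alpha)
    covered = sum(z <= locus_score(means, q_hat, x) for x, z in zip(x_te, z_te))
    return covered / len(z_te)


if __name__ == "__main__":
    print(f"empirical coverage: {demo():.3f}")
```

Because the calibration step only uses ranks of the PIT scores, the Exponential engine does not need to be correct for the coverage frequency to land near $1-\alpha$; a poor engine would instead show up as scores that are less informative across inputs.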

Paper Structure (39 sections, 6 theorems, 90 equations, 1 figure, 9 tables)

Key Result

Theorem 1

Under the construction above, if the data points are i.i.d., then [...]. Moreover, if ties between the PIT scores occur with probability $0$, then [...].
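For orientation, marginal validity guarantees of the split-calibration type are usually stated as a bound on the probability that the realized loss falls below the score. The display below is a sketch in the notation of Figure 1, not a quotation of the theorem: the calibration-set size $n$ and the fresh test pair $(X_{n+1}, Z_{n+1})$ are assumed symbols, and the upper bound is the standard split-conformal one.

```latex
\[
  1-\alpha
  \;\le\;
  \mathbb{P}\bigl( Z_{n+1} \le U_\alpha(X_{n+1}) \bigr)
  \;\le\;
  1-\alpha+\frac{1}{n+1}
\]
```

In results of this form, the lower bound needs only the i.i.d. (or exchangeability) assumption, while the upper bound additionally requires that ties among the calibration scores occur with probability $0$.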

Figures (1)

  • Figure 1: Loss-centric reliability versus variance. We fit two models to the same data: linear regression (left) and $k$NN regression (right). (a) Data with the true mean $f(x)$ (black) and fitted prediction function $g(x)$ (red). (b) True conditional standard deviation $\sigma(x)$ (aleatoric proxy). (c) Realized absolute loss $Z=|g(X)-Y|$ (points) and our Locus score $U_\alpha(x)$ (curve), a calibrated estimate of the $(1-\alpha)$ loss quantile. (d) The induced accept/flag rule for this specific prediction function: accept if $U_\alpha(x)\le\tau$ (blue) and flag otherwise (red). $U_\alpha$ is interpretable in loss units and captures failures in the predicted values (e.g., linear misfit in low-variance regions) that $\sigma(x)$ can miss.
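The accept/flag rule in panel (d) can be sketched in a few lines. The function names and the numbers below are hypothetical and chosen only for illustration; the sketch shows the rule itself and one quantity a practitioner might monitor (the share of accepted inputs whose realized loss exceeds $\tau$), without asserting the specific guarantee proved in the paper.

```python
def flag_rule(scores, tau):
    """Flag an input when its score U_alpha(x) exceeds tau; accept otherwise."""
    return [u > tau for u in scores]


def large_loss_rate(losses, flags, tau):
    """Fraction of accepted inputs whose realized loss Z exceeds tau."""
    accepted = [z for z, flagged in zip(losses, flags) if not flagged]
    if not accepted:
        return 0.0  # nothing was accepted, so no accepted large losses
    return sum(z > tau for z in accepted) / len(accepted)


# Illustrative values only (not from the paper).
scores = [0.2, 0.9, 0.4, 1.5]   # U_alpha(x) for four inputs
losses = [0.1, 1.2, 0.6, 0.3]   # realized absolute losses Z
tau = 0.5

flags = flag_rule(scores, tau)
print(flags)                                # [False, True, False, True]
print(large_loss_rate(losses, flags, tau))  # 0.5
```

Since $U_\alpha$ and $\tau$ are both expressed in loss units, the threshold can be set directly from a tolerable loss level rather than from an abstract uncertainty scale.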

Theorems & Definitions (12)

  • Theorem 1: Marginal validity
  • Theorem 2: Asymptotic conditional coverage
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Proof of Theorem \ref{thm:coverage}
  • Proof of Theorem \ref{thm:loss-control-A}
  • Proof of Theorem \ref{thm:asymp-coverage}
  • Proof of Theorem \ref{thm:uniform-lambda-explicit-correct}
  • Remark 1: Sharper uniform deviations
  • ...and 2 more