Optimal and Provable Calibration in High-Dimensional Binary Classification: Angular Calibration and Platt Scaling

Yufan Li; Pragya Sur

TL;DR

This work develops a provably well-calibrated framework for calibrating high-dimensional binary classifiers with Gaussian features. It introduces angular calibration, which interpolates between informative logits and Gaussian noise based on the angle between the estimated and true weight vectors, and proves both calibration and Bregman-optimality in the proportional regime where $n/d\to c$. It further shows that Platt scaling converges to the angular predictor under suitable conditions, providing a principled high-dimensional guarantee for a widely used method. The alignment angle can be consistently estimated from observable quantities, which makes the approach practical. Numerical experiments support the theory, showing calibration improvements and robustness across simulations and semi-real tasks; extensions to non-Gaussian designs are left to future work.

Abstract

We study the fundamental problem of calibrating a linear binary classifier of the form $\sigma(\hat{w}^\top x)$, where the feature vector $x$ is Gaussian, $\sigma$ is a link function, and $\hat{w}$ is an estimator of the true linear weight $w_\star$. By interpolating with a noninformative $\textit{chance classifier}$, we construct a well-calibrated predictor whose interpolation weight depends on the angle $\angle(\hat{w}, w_\star)$ between the estimator $\hat{w}$ and the true linear weight $w_\star$. We establish that this angular calibration approach is provably well-calibrated in a high-dimensional regime where the number of samples and the number of features both diverge at a comparable rate. The angle $\angle(\hat{w}, w_\star)$ can be consistently estimated. Furthermore, the resulting predictor is uniquely $\textit{Bregman-optimal}$, minimizing the Bregman divergence to the true label distribution within a suitable class of calibrated predictors. Our work is the first to provide a calibration strategy that provably satisfies both calibration and optimality properties in high dimensions. Additionally, we identify conditions under which a classical Platt-scaling predictor converges to our Bregman-optimal calibrated solution. Thus, Platt scaling also provably inherits these desirable properties in high dimensions.
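To make the interpolation idea concrete, here is a minimal sketch of one way such a predictor can be computed. It is not taken from the paper: the abstract gives no formula, so the expression below is an assumption obtained from standard bivariate Gaussian conditioning. It assumes isotropic features $x \sim N(0, I_d)$, an estimator $\hat{w}$ independent of the test point, a logistic link, and that the angle $\theta = \angle(\hat{w}, w_\star)$ and the norm $\|w_\star\|$ are known (in practice they would be estimated). Under those assumptions, $\mathbb{E}[y \mid \hat{w}^\top x] = \mathbb{E}_Z\,\sigma\big(\|w_\star\|(\cos\theta \cdot u + \sin\theta \cdot Z)\big)$ with $u = \hat{w}^\top x / \|\hat{w}\|$ and $Z \sim N(0,1)$. The names `angular_predictor`, `theta`, and `signal_norm` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def sigmoid(t):
    # Numerically stable logistic function (assumed link, not fixed by the abstract).
    return 0.5 * (1.0 + np.tanh(0.5 * t))


def angular_predictor(x, w_hat, theta, signal_norm, n_nodes=64):
    """Sketch of an angle-based calibrated predictor.

    x           : array of shape (m, d), test features (assumed N(0, I_d))
    w_hat       : array of shape (d,), estimated weight vector
    theta       : assumed-known angle between w_hat and the true weight
    signal_norm : assumed-known norm of the true weight
    Returns an array of shape (m,) with predicted probabilities.
    """
    # Normalised score u = w_hat^T x / ||w_hat||, which is N(0, 1) under the assumptions.
    u = x @ w_hat / np.linalg.norm(w_hat)
    # Gauss-Hermite nodes/weights for the weight exp(-z^2 / 2); rescale to a N(0, 1) average.
    z, wts = hermegauss(n_nodes)
    wts = wts / wts.sum()
    # Mix the informative score (cos term) with pure Gaussian noise (sin term).
    logits = signal_norm * (np.cos(theta) * u[:, None] + np.sin(theta) * z[None, :])
    return sigmoid(logits) @ wts
```

Two limiting cases illustrate the interpolation: with `theta = 0` the sketch returns the plain logistic prediction on the rescaled score, and with `theta = np.pi / 2` (an uninformative estimator) it returns 0.5 for every input, which is the chance prediction.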
