The Benefit of Being Bayesian in Online Conformal Prediction
Zhiyu Zhang, Zhou Lu, Heng Yang
TL;DR
This work addresses online conformal prediction under adversarial data by introducing a Bayesian-regularized CP algorithm that outputs confidence thresholds online for multiple confidence-level ($\alpha$) queries. The key idea is to maintain a belief $P_t$ that blends a prior with the empirical distribution of past data, and to output its $\alpha$-quantile as the threshold, $r_t(\alpha)=q_\alpha(P_t)$. This yields a non-linearized FTRL-like update with provable regret $O(R\sqrt{T})$ for all $\alpha$, and it avoids the monotonicity issue that affects first-order optimization baselines. The framework adapts to iid data, recovering near-ERM guarantees with dataset-conditional coverage, and its Bayesian interpretation connects to Dirichlet-process posterior means. Extensions include a memory-efficient quantized version and a discounted variant for continual distribution shift, both supported by experiments on synthetic switching sequences and stock-price data. Overall, the approach combines the strengths of data-centric CP with Bayesian regularization to deliver robust, multi-$\alpha$ online confidence sets for real-world risk assessment.
Abstract
Based on the framework of Conformal Prediction (CP), we study the online construction of confidence sets given a black-box machine learning model. By converting the target confidence levels into quantile levels, the problem can be reduced to predicting the quantiles (in hindsight) of a sequentially revealed data sequence. Two very different approaches have been studied previously: (i) Assuming the data sequence is iid or exchangeable, one could maintain the empirical distribution of the observed data as an algorithmic belief, and directly predict its quantiles. (ii) Due to the fragility of statistical assumptions, a recent trend is to consider the non-distributional, adversarial setting and apply first-order online optimization algorithms to moving quantile losses. However, the latter approach requires oracle knowledge of the target quantile level, and suffers from a previously overlooked monotonicity issue due to the associated loss linearization. This paper presents an adaptive CP algorithm that combines the strengths of both approaches. Without any statistical assumption, it is able to answer multiple arbitrary confidence level queries with low regret, while also overcoming the monotonicity issue suffered by first-order optimization baselines. Furthermore, if the data sequence is actually iid, then the same algorithm is automatically equipped with the "correct" coverage probability guarantee. To achieve this, our key technical innovation is to regularize the aforementioned algorithmic belief (the empirical distribution) by a Bayesian prior, which robustifies it by simulating a non-linearized Follow the Regularized Leader (FTRL) algorithm on the output. Such a belief update backbone is shared by prediction heads targeting different confidence levels, bringing practical benefits analogous to the recently proposed concept of U-calibration (Kleinberg et al., 2023).
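The belief-regularization idea can be illustrated with a minimal Python sketch. It is not the authors' exact algorithm: the uniform prior on $[0, R]$, the $1/\sqrt{t}$ prior-weight schedule, the assumption that scores lie in $[0, R]$, and the names `BayesianQuantileBelief`, `prior_weight`, `cdf`, `quantile`, and `update` are all illustrative assumptions. The sketch only shows how a single prior-plus-empirical belief can serve several quantile levels at once.

```python
import bisect
import math


class BayesianQuantileBelief:
    """Belief = mixture of a uniform prior on [0, R] and the empirical
    distribution of past scores (illustrative assumption, not the paper's
    exact construction)."""

    def __init__(self, R=1.0):
        self.R = R
        self.scores = []  # past scores, kept sorted

    def prior_weight(self):
        # Hypothetical schedule: full weight on the prior before any data,
        # decaying like 1/sqrt(t) as observations accumulate.
        t = len(self.scores) + 1
        return 1.0 / math.sqrt(t)

    def cdf(self, r):
        # CDF of the mixture belief evaluated at r.
        lam = self.prior_weight()
        prior_cdf = min(max(r / self.R, 0.0), 1.0)
        n = len(self.scores)
        emp_cdf = bisect.bisect_right(self.scores, r) / n if n else 0.0
        return lam * prior_cdf + (1.0 - lam) * emp_cdf

    def quantile(self, alpha, tol=1e-9):
        # Generalized inverse inf{r : cdf(r) >= alpha}, found by bisection
        # on [0, R]; valid because the mixture CDF is non-decreasing.
        lo, hi = 0.0, self.R
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if self.cdf(mid) >= alpha:
                hi = mid
            else:
                lo = mid
        return hi

    def update(self, score):
        # Reveal the new score, clipped to [0, R], and fold it into the belief.
        bisect.insort(self.scores, min(max(score, 0.0), self.R))


if __name__ == "__main__":
    belief = BayesianQuantileBelief(R=1.0)
    for s in [0.2, 0.7, 0.4, 0.9, 0.3]:
        # Every quantile level is answered from the same belief.
        thresholds = [belief.quantile(a) for a in (0.5, 0.8, 0.95)]
        print(thresholds)
        belief.update(s)
```

Because every quantile level is read off the same belief, the returned thresholds are non-decreasing in $\alpha$ by construction, and no level needs to be fixed before the data arrive.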
