Attention-Based Feature Online Conformal Prediction for Time Series
Meiyi Zhu, Caili Guo, Chunyan Feng, Osvaldo Simeone
TL;DR
The paper tackles uncertainty quantification for non-stationary time series by extending online conformal prediction (OCP) to operate in a learned feature space and by introducing an attention-based weighting scheme over historical observations. The proposed AFOCP framework combines feature-space nonconformity scores with data-driven weights that adapt online, producing prediction sets that maintain long-term coverage while shortening prediction intervals. The authors prove deterministic long-term coverage guarantees and show that, under mild regularity assumptions, feature OCP (FOCP) and AFOCP achieve shorter time-averaged intervals than standard OCP. Extensive experiments on synthetic and real-world datasets complement the formal results. The approach offers a practical and scalable route to reliable and efficient uncertainty quantification in non-stationary environments, with potential applicability to diverse time-series applications.
Abstract
Online conformal prediction (OCP) wraps around any pre-trained predictor to produce prediction sets with coverage guarantees that hold irrespective of temporal dependencies or distribution shifts. However, standard OCP faces two key limitations: it operates in the output space using simple nonconformity (NC) scores, and it treats all historical observations uniformly when estimating quantiles. This paper introduces attention-based feature OCP (AFOCP), which addresses both limitations through two innovations. First, AFOCP operates in the feature space of pre-trained neural networks, leveraging learned representations to construct more compact prediction sets by concentrating on task-relevant information while suppressing nuisance variation. Second, AFOCP incorporates an attention mechanism that adaptively weights historical observations based on their relevance to the current test point, which lets it handle non-stationarity and distribution shifts. We provide theoretical guarantees showing that AFOCP maintains long-term coverage while provably achieving smaller prediction intervals than standard OCP under mild regularity conditions. Extensive experiments on synthetic and real-world time series datasets demonstrate that AFOCP reduces the size of prediction intervals by as much as $88\%$ compared with OCP while maintaining target coverage levels, validating the benefits of both feature-space calibration and attention-based adaptive weighting.
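The two ingredients named above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the update rule or the attention parameterization, so the code assumes a generic online quantile-tracking update for the OCP threshold and plain dot-product softmax weights over past observations. All function names and parameters (`ocp_thresholds`, `attention_weights`, `weighted_quantile`, `eta`, `temperature`) are hypothetical, and feature-space scoring is omitted.

```python
import math


def ocp_thresholds(scores, alpha=0.1, eta=0.05, q0=1.0):
    """Assumed generic OCP update: track a threshold q online.

    At each step the prediction set is {y : score(y) <= q}. After the true
    score is revealed, q grows on a miss and shrinks on a hit, so the
    long-run miss rate is driven toward alpha.
    """
    q = q0
    thresholds, errors = [], []
    for s in scores:
        thresholds.append(q)
        err = 1.0 if s > q else 0.0  # 1.0 when the set missed the truth
        errors.append(err)
        q += eta * (err - alpha)
    return thresholds, errors


def attention_weights(keys, query, temperature=1.0):
    """Softmax over dot-product similarity between past keys and the query.

    This is an illustrative stand-in for relevance weighting, not the
    paper's attention mechanism.
    """
    logits = [
        sum(k_i * q_i for k_i, q_i in zip(key, query)) / temperature
        for key in keys
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative weight reaches `level`."""
    pairs = sorted(zip(scores, weights))
    cumulative = 0.0
    for score, weight in pairs:
        cumulative += weight
        if cumulative >= level:
            return score
    return pairs[-1][0]  # guard against rounding when level is near 1
```

With uniform weights, `weighted_quantile` reduces to an ordinary empirical quantile, which corresponds to treating all historical observations alike. Passing weights from `attention_weights` instead shifts the quantile toward the scores of past points whose keys resemble the current query.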
