Conformal Prediction: A Data Perspective
Xiaofan Zhou, Baiting Chen, Yu Gui, Lu Cheng
TL;DR
Conformal prediction provides distribution-free uncertainty quantification by producing calibrated prediction sets with finite-sample validity under exchangeability. This survey reframes CP from a data perspective, cataloging static and dynamic data modalities, and reviews foundational CP variants (full, split, weighted) plus conformal risk control, alongside broad evaluation metrics. It surveys CP applications across structured, unstructured, and spatio-temporal data, detailing data-specific nonconformity scores, efficiency considerations, and graph/text/image adaptations. The paper highlights open challenges in non-exchangeable settings, large-scale and streaming data, multi-modal integration, and responsible-AI implications, outlining future research directions for scalable, robust, and adaptable CP methods.
Abstract
Conformal prediction (CP), a distribution-free uncertainty quantification (UQ) framework, reliably provides valid predictive inference for black-box models. CP constructs prediction sets that contain the true output with a specified probability. However, the diverse data modalities of modern data science, along with increasing data and model complexity, challenge traditional CP methods. These developments have spurred novel approaches to address evolving scenarios. This survey reviews the foundational concepts of CP and recent advancements from a data-centric perspective, including applications to structured, unstructured, and dynamic data. We also discuss the challenges and opportunities CP faces with large-scale data and models.
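
To make concrete how CP builds a prediction set with a specified coverage level, the following is a minimal sketch of split conformal prediction for regression. It assumes absolute residuals as the nonconformity score and exchangeable calibration and test data. The function name `split_conformal_interval` and its parameters are hypothetical and chosen for illustration; they are not taken from the survey. The predictions may come from any black-box model fit on data disjoint from the calibration set.

```python
import numpy as np


def split_conformal_interval(cal_pred, cal_y, test_pred, alpha=0.1):
    """Return (lower, upper) bounds targeting coverage of at least 1 - alpha.

    Illustrative assumption: the nonconformity score is the absolute residual.
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_y = np.asarray(cal_y, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)

    # Nonconformity scores on the held-out calibration set.
    scores = np.abs(cal_y - cal_pred)
    n = scores.shape[0]

    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        q = np.inf
    else:
        q = np.sort(scores)[k - 1]

    return test_pred - q, test_pred + q


if __name__ == "__main__":
    # Synthetic demo: a deliberately crude "model" that predicts 2 * x.
    rng = np.random.default_rng(0)
    x_cal, x_test = rng.uniform(0, 1, 500), rng.uniform(0, 1, 500)
    y_cal = 2 * x_cal + rng.normal(0, 0.3, 500)
    y_test = 2 * x_test + rng.normal(0, 0.3, 500)

    lo, hi = split_conformal_interval(2 * x_cal, y_cal, 2 * x_test, alpha=0.1)
    coverage = np.mean((y_test >= lo) & (y_test <= hi))
    print(f"empirical coverage: {coverage:.3f}")
```

The guarantee is marginal: over the joint draw of calibration and test points, the interval covers the true output with probability at least 1 - alpha, regardless of how accurate the underlying model is. A poorer model simply yields wider intervals.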
