Online Selective Conformal Prediction with Asymmetric Rules: A Permutation Test Approach
Mingyi Zheng, Ying Jin
TL;DR
This work introduces PEMI, a permutation-based Mondrian conformal inference framework for online selective prediction under arbitrary asymmetric selection rules. PEMI calibrates conformity scores over a reference set of data permutations that preserve the selection event. Under exchangeability, this yields finite-sample selection-conditional coverage for both full and Monte-Carlo permutation schemes. The framework also extends to settings with additional offline data and with multiple test samples. The authors provide theoretical guarantees and develop computationally efficient instantiations for covariate-based, conformal p/e-value-based, and early-outcome-based selection rules, with an application to drug discovery and extensive simulation studies. The approach broadens selective conformal prediction beyond symmetric selection rules, reduces vacuous prediction sets, and offers practical tools for online uncertainty quantification and false coverage rate (FCR) control under a taxonomy of selection rules.
Abstract
Selective conformal prediction aims to construct prediction sets with valid coverage for a test unit conditional on it being selected by a data-driven mechanism. Existing methods in the offline setting handle any selection mechanism that is permutation-invariant in the labeled data. Extending them to the online setting, where data arrive sequentially and later decisions depend on earlier ones, is difficult because the selection mechanism is naturally asymmetric. As a result, existing methods address only a limited collection of selection mechanisms. In this paper, we propose PErmutation-based Mondrian Conformal Inference (PEMI), a general permutation-based framework for selective conformal prediction with arbitrary asymmetric selection rules. Motivated by full and Mondrian conformal prediction, PEMI identifies all permutations of the observed data (or a Monte-Carlo subset thereof) that lead to the same selection event, and calibrates a prediction set using conformity scores over this selection-preserving reference set. Under standard exchangeability conditions, our prediction sets achieve finite-sample exact selection-conditional coverage for any asymmetric selection mechanism and any prediction model. PEMI naturally incorporates additional offline labeled data, extends to selection mechanisms with multiple test samples, and achieves false coverage rate (FCR) control with fine-grained selection taxonomies. We further work out several efficient instantiations for commonly used online selection rules, including covariate-based rules, conformal p/e-value-based procedures, and selection based on earlier outcomes. Finally, we demonstrate the efficacy of our methods across various selection rules on a real drug discovery dataset and investigate their performance via simulations.
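The calibration idea described in the abstract can be illustrated with a minimal Monte-Carlo sketch. It is not the authors' algorithm or code: the selection rule `selects_last`, the function names, the absolute-residual conformity score, the way permutations are sampled, and the deterministic handling of ties are all assumptions made for illustration. The sketch samples random reorderings of the n labeled units plus the test unit, keeps those under which the unit in the final position would still be selected, and compares the test unit's score with the scores of the units that land in the final position across that reference set.

```python
import numpy as np

def selects_last(x_seq, window=3):
    """Hypothetical asymmetric, covariate-based rule: select the unit in the
    last position if its covariate exceeds the mean of the `window` units
    immediately before it. The decision depends on the data order."""
    return x_seq[-1] > np.mean(x_seq[-1 - window:-1])

def selection_preserving_perms(x_all, rule, n_perms, rng):
    """Sample random orderings of all n+1 units and keep those under which
    the unit in the final (test) position is still selected. The identity
    ordering is always kept, since the observed test unit was selected."""
    m = len(x_all)
    kept = [np.arange(m)]
    for _ in range(n_perms):
        perm = rng.permutation(m)
        if rule(x_all[perm]):
            kept.append(perm)
    return np.array(kept)

def pemi_style_pvalue(scores_all, ref_perms):
    """Fraction of reference orderings whose final-position unit has a
    conformity score at least as large as the test score `scores_all[-1]`."""
    last_scores = scores_all[ref_perms[:, -1]]
    return np.mean(last_scores >= scores_all[-1])

def prediction_set(x_cal, y_cal, x_test, predict, y_grid, alpha,
                   rule=selects_last, n_perms=500, seed=0):
    """Grid-based prediction set: keep every candidate label whose
    permutation p-value exceeds alpha. Assumes the rule uses covariates
    only, so the reference set does not depend on the candidate label."""
    rng = np.random.default_rng(seed)
    x_all = np.append(x_cal, x_test)
    if not rule(x_all):
        raise ValueError("the test unit is not selected under the observed order")
    ref_perms = selection_preserving_perms(x_all, rule, n_perms, rng)
    cal_scores = np.abs(y_cal - predict(x_cal))  # absolute-residual score
    kept = []
    for y in y_grid:
        scores_all = np.append(cal_scores, abs(y - predict(x_test)))
        if pemi_style_pvalue(scores_all, ref_perms) > alpha:
            kept.append(y)
    return np.array(kept)

# Toy usage with synthetic data and a fixed (hypothetical) prediction model.
demo_rng = np.random.default_rng(1)
x_cal = demo_rng.normal(size=20)
y_cal = 2 * x_cal + demo_rng.normal(size=20)
x_test = x_cal.max() + 1.0  # guarantees the toy rule selects the test unit
y_grid = np.linspace(2 * x_test - 10, 2 * x_test + 10, 201)
pred_set = prediction_set(x_cal, y_cal, x_test, lambda x: 2 * x, y_grid, alpha=0.1)
```

Because the toy rule looks only at covariates, the reference set is computed once. For rules that depend on earlier outcomes, the set of selection-preserving orderings can change with the hypothesized test label, so it would have to be recomputed for each candidate value. The efficient instantiations mentioned in the abstract presumably avoid this kind of brute-force sampling.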
