Robust Conformal Prediction with a Single Binary Certificate
Soroush H. Zargarbashi, Aleksandar Bojchevski
TL;DR
This paper addresses the computational burden and inefficiency of existing robust conformal prediction methods under adversarial perturbations. It introduces BinCP, a binarized conformal prediction framework that uses a thresholded, smoothed score and a single binary certificate to guarantee coverage for any input within an adversarial perturbation ball. The method provides closed-form or efficiently computable bounds for common smoothing schemes, supports finite-sample corrections via Clopper-Pearson intervals, and can employ de-randomized certificates to further reduce sampling needs. Empirically, BinCP delivers smaller robust prediction sets on CIFAR-10, ImageNet, and Cora-ML with far fewer Monte-Carlo samples (e.g., 150 on CIFAR-10, versus roughly $10^4$ per point for existing baselines), while maintaining the formal guarantees. Overall, BinCP offers a practical, model-agnostic, black-box approach to robust uncertainty quantification.
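To make the idea of a binary certificate concrete, here is a minimal sketch for one common smoothing scheme, isotropic Gaussian noise. It uses the standard randomized-smoothing bound: if a binary event holds with probability at least $p$ under noise $\mathcal{N}(0, \sigma^2 I)$ at input $x$, then at any $x'$ with $\|x' - x\|_2 \le r$ it holds with probability at least $\Phi(\Phi^{-1}(p) - r/\sigma)$. The function name and arguments are illustrative assumptions, and this is not necessarily the exact form of the certificate used in the paper.

```python
from statistics import NormalDist


def gaussian_binary_lower_bound(p: float, radius: float, sigma: float) -> float:
    """Worst-case probability of a binary event under Gaussian smoothing.

    Assumes the event has probability at least `p` at the clean input and
    that the perturbation has l2 norm at most `radius`. (Hypothetical helper
    for illustration; names are not taken from the paper.)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if radius < 0.0 or sigma <= 0.0:
        raise ValueError("radius must be >= 0 and sigma must be > 0")
    std_normal = NormalDist()  # standard normal: mean 0, std 1
    # Shift the p-quantile by radius / sigma and map back to a probability.
    return std_normal.cdf(std_normal.inv_cdf(p) - radius / sigma)
```

Because the bound depends only on $p$, $r$, and $\sigma$, and not on the input itself, it can be evaluated once and reused, which is one way to read the phrase "single binary certificate".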
Abstract
Conformal prediction (CP) converts any model's output to prediction sets with a guarantee to cover the true label with (adjustable) high probability. Robust CP extends this guarantee to worst-case (adversarial) inputs. Existing baselines achieve robustness by bounding randomly smoothed conformity scores. In practice, they need expensive Monte-Carlo (MC) sampling (e.g., $\sim10^4$ samples per point) to maintain an acceptable set size. We propose a robust conformal prediction method that produces smaller sets even with significantly fewer MC samples (e.g., 150 for CIFAR-10). Our approach binarizes samples with an adjustable (or automatically adjusted) threshold selected to preserve the coverage guarantee. Remarkably, we prove that robustness can be achieved by computing only one binary certificate, unlike previous methods that certify each calibration (or test) point. Thus, our method is faster and returns smaller robust sets. We also eliminate a previous limitation that requires a bounded score function.
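The binarization step described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes conformity scores where higher means more conforming, a fixed threshold `tau`, and calibration of the probability level `p` with the standard split-conformal quantile. All function names are hypothetical, and the robustness correction and finite-sample (Clopper-Pearson) adjustment are deliberately left out.

```python
import math


def binarized_score(sample_scores, tau):
    """Fraction of noisy score samples that reach the threshold tau."""
    return sum(s >= tau for s in sample_scores) / len(sample_scores)


def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold for conformity scores (higher = better).

    Returns the k-th smallest calibration score with k = floor(alpha * (n + 1)),
    so an exchangeable test score falls below it with probability at most alpha.
    """
    n = len(cal_scores)
    k = math.floor(alpha * (n + 1))
    if k < 1:
        return -math.inf  # too few calibration points: accept every label
    return sorted(cal_scores)[k - 1]


def prediction_set(per_label_samples, tau, p):
    """Labels whose binarized score is at least the calibrated level p."""
    return [
        label
        for label, samples in enumerate(per_label_samples)
        if binarized_score(samples, tau) >= p
    ]
```

In this sketch, `p` would be obtained by applying `conformal_threshold` to the true-label binarized scores of the calibration points, and each test label is then kept or dropped by a single comparison against `p`.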
