Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score
Xuanning Zhou, Hao Zeng, Xiaobo Xia, Bingyi Jing, Hongxin Wei
TL;DR
This work addresses the instability and inefficiency of conformal prediction (CP) under limited labeled calibration data by introducing SemiCP, a semi-supervised CP framework that leverages unlabeled data through a novel unlabeled nonconformity score based on Nearest Neighbor Matching (NNM). The method augments the calibration set with unlabeled scores, computing a SemiCP threshold that asymptotically matches the oracle threshold, thereby preserving coverage while reducing prediction-set size and variance. The authors provide theoretical guarantees for the unlabeled score’s distributional convergence and demonstrate extensive empirical gains across CIFAR-10/100 and ImageNet, including applicability to conditional CP and compatibility with existing CP enhancements and multiple model architectures. SemiCP is shown to be data-efficient, model-agnostic, and capable of integrating with interpolation and ClusterCP, offering a practical, scalable approach to reliable uncertainty quantification in real-world scenarios.
Abstract
Conformal prediction (CP) is a powerful framework for uncertainty quantification, providing prediction sets with coverage guarantees when calibrated on sufficient labeled data. However, in real-world applications where labeled data is often limited, standard CP can suffer from coverage deviation and produce overly large prediction sets. In this paper, we extend CP to the semi-supervised setting and propose SemiCP, which leverages both labeled and unlabeled data for calibration. Specifically, we introduce a novel nonconformity score function, NNM, designed for unlabeled data. For each unlabeled example, this function selects labeled data with similar pseudo-label scores to estimate its nonconformity score; the estimated scores are then integrated into the calibration process to overcome sample size limitations. We theoretically demonstrate that, under mild assumptions, SemiCP provides an asymptotic coverage guarantee for prediction sets. Extensive experiments further validate that our approach effectively reduces instability and inefficiency under limited calibration data, can be adapted to conditional coverage settings, and integrates seamlessly with existing CP methods.
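The calibration procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the base score 1 - p(label | x), the bias-corrected form of the unlabeled score (pseudo-label score plus the gap observed on the nearest labeled example), and all function names are hypothetical choices made for the example. The abstract only states that labeled data with similar pseudo-label scores are used to estimate the unlabeled scores.

```python
import math

def score(probs, label):
    # Assumed base nonconformity score: one minus the softmax probability of `label`.
    return 1.0 - probs[label]

def argmax(probs):
    # Pseudo-label: index of the most probable class.
    return max(range(len(probs)), key=probs.__getitem__)

def nnm_scores(probs_lab, y_lab, probs_unl):
    # Scores of labeled points under their true labels and under their pseudo-labels.
    true_lab = [score(p, y) for p, y in zip(probs_lab, y_lab)]
    pseudo_lab = [score(p, argmax(p)) for p in probs_lab]
    estimates = []
    for p in probs_unl:
        s = score(p, argmax(p))  # pseudo-label score of the unlabeled point
        # Nearest labeled neighbor, matched on pseudo-label score.
        j = min(range(len(pseudo_lab)), key=lambda i: abs(pseudo_lab[i] - s))
        # Assumption: correct the pseudo score by the neighbor's (true - pseudo) gap.
        estimates.append(s + true_lab[j] - pseudo_lab[j])
    return estimates

def conformal_threshold(scores, alpha):
    # Standard split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score.
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else sorted(scores)[k - 1]

def semicp_threshold(probs_lab, y_lab, probs_unl, alpha=0.1):
    # Calibrate on labeled scores augmented with estimated unlabeled scores.
    labeled = [score(p, y) for p, y in zip(probs_lab, y_lab)]
    return conformal_threshold(labeled + nnm_scores(probs_lab, y_lab, probs_unl), alpha)

def prediction_set(probs, threshold):
    # All classes whose nonconformity score does not exceed the threshold.
    return [k for k in range(len(probs)) if score(probs, k) <= threshold]

if __name__ == "__main__":
    probs_lab = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
    y_lab = [0, 1, 1]
    probs_unl = [[0.62, 0.38], [0.05, 0.95]]
    tau = semicp_threshold(probs_lab, y_lab, probs_unl, alpha=0.2)
    print(tau, prediction_set([0.9, 0.1], tau))
```

In this toy setup, three labeled scores alone are too few for a finite threshold at alpha = 0.2, since the required rank ceil(4 * 0.8) = 4 exceeds the sample size. Adding the two estimated scores makes the quantile well defined, which is the small-sample limitation the abstract refers to.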
