Quantifying and Improving Adaptivity in Conformal Prediction through Input Transformations
Sooyong Jang, Insup Lee
TL;DR
This work addresses how to evaluate and improve adaptivity in conformal prediction. It introduces a transformation-based difficulty score used to form balanced bins, and two metrics, T-CV and T-SS, that assess coverage stability across bins and the relation between difficulty and set size. Building on this, it develops an adaptive, group-conditional conformal prediction algorithm that partitions data by estimated difficulty and learns a threshold per group, maintaining coverage while producing more efficient prediction sets. Empirical results on ImageNet and a visual acuity task show that the proposed metrics better capture adaptivity and that, compared to baselines, the algorithm yields smaller prediction sets on easy cases and larger ones on hard cases. The approach offers a practical route to more reliable and interpretable adaptive conformal predictions in both vision and medical tasks.
Abstract
Conformal prediction constructs a set of labels instead of a single point prediction, while providing a probabilistic coverage guarantee. Beyond the coverage guarantee, adaptiveness to example difficulty is an important property: a method should produce larger prediction sets for more difficult examples and smaller ones for easier examples. Existing evaluation methods for adaptiveness typically analyze coverage rate violation or average set size across bins of examples grouped by difficulty. However, these approaches often suffer from imbalanced binning, which can lead to inaccurate estimates of coverage or set size. To address this issue, we propose a binning method that leverages input transformations to sort examples by difficulty, followed by uniform-mass binning. Building on this binning, we introduce two metrics to better evaluate adaptiveness. Because the bins are balanced, these metrics provide more reliable estimates of coverage rate violation and average set size, leading to a more accurate assessment of adaptiveness. Through experiments, we demonstrate that our proposed metric correlates more strongly with the desired adaptiveness property than existing ones. Furthermore, motivated by our findings, we propose a new adaptive prediction set algorithm that groups examples by estimated difficulty and applies group-conditional conformal prediction, which allows us to determine an appropriate threshold for each group. Experimental results on both (a) an image classification task (ImageNet) and (b) a medical task (visual acuity prediction) show that our method outperforms existing approaches according to the new metrics.
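To make the pipeline described above concrete, the following is a minimal sketch of uniform-mass binning followed by group-conditional split conformal prediction. It is an illustration under stated assumptions, not the authors' implementation: the difficulty score is taken as a given array (how it is derived from input transformations is not specified here), the nonconformity score is assumed to be one minus the predicted class probability, and all function names are hypothetical.

```python
import numpy as np

def uniform_mass_edges(difficulty, n_bins):
    """Interior cut points that split the calibration difficulty scores
    into n_bins groups of roughly equal size (uniform-mass binning)."""
    return np.quantile(difficulty, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])

def assign_bins(difficulty, edges):
    """Map each difficulty score to a bin index in [0, len(edges)]."""
    return np.searchsorted(edges, difficulty, side="right")

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile used in split conformal
    prediction; returns +inf when the group is too small for the target level."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return np.inf
    return np.sort(scores)[k - 1]

def group_thresholds(cal_scores, cal_bins, n_bins, alpha):
    """One threshold per difficulty group, calibrated on that group only."""
    return np.array([
        conformal_quantile(cal_scores[cal_bins == b], alpha)
        for b in range(n_bins)
    ])

def prediction_sets(probs, bins, thresholds):
    """Boolean mask of shape (n_examples, n_classes): a label is included
    when its assumed score (1 - probability) is within the group threshold."""
    return (1.0 - probs) <= thresholds[bins][:, None]
```

In this sketch, the bin edges are computed once on calibration data and reused at test time, so each test example is compared against the threshold of the group its difficulty score falls into. Groups with larger thresholds yield larger sets, which is the behavior one would hope to see for the harder groups.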
