Test-time augmentation improves efficiency in conformal prediction
Divya Shanmugam, Helen Lu, Swami Sankaranarayanan, John Guttag
TL;DR
This work tackles the inefficiency of conformal prediction, where prediction sets can be too large to be informative. It introduces test-time augmentation (TTA) into the conformal pipeline by learning an augmentation policy and an aggregation of the predictions made on TTA-transformed inputs, while preserving exchangeability and therefore the coverage guarantee. Empirical results on ImageNet, iNaturalist, and CUB-Birds show consistent reductions in average and class-conditional prediction-set sizes (up to 14% on average, with larger gains on harder classes) while maintaining or modestly improving coverage and adaptivity. The method is scalable, requires no model retraining, and remains robust under distribution shift, offering a practical path toward more efficient conformal predictors in vision tasks.
Abstract
A conformal classifier produces a set of predicted classes and provides a probabilistic guarantee that the set includes the true class. Unfortunately, conformal classifiers often produce uninformatively large sets. In this work, we show that test-time augmentation (TTA), a technique that introduces inductive biases during inference, reduces the size of the sets produced by conformal classifiers. Our approach is flexible, computationally efficient, and effective. It can be combined with any conformal score, requires no model retraining, and reduces prediction set sizes by 10%-14% on average. We evaluate the approach across three datasets, three models, two established conformal scoring methods, different guarantee strengths, and several distribution shifts to show when and why test-time augmentation is a useful addition to the conformal pipeline.
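To make the pipeline concrete, the following is a minimal sketch of how TTA-aggregated probabilities could feed a split conformal classifier. It is an illustration under stated assumptions, not the authors' implementation: the names `tta_probs`, `conformal_threshold`, `prediction_sets`, and `predict_fn` are hypothetical, the aggregation weights are fixed (uniform by default) rather than learned as the paper describes, and the score `1 - p(true class)` is a simple stand-in for the conformal scores evaluated in the paper.

```python
import numpy as np

def tta_probs(predict_fn, x, augmentations, weights=None):
    """Weighted average of class probabilities over augmented views of x.

    predict_fn (hypothetical) maps a batch of inputs to an (n, K) array of
    class probabilities. The paper learns the aggregation; here the weights
    are simply given, defaulting to a uniform average.
    """
    probs = np.stack([predict_fn(aug(x)) for aug in augmentations])  # (A, n, K)
    if weights is None:
        weights = np.full(len(augmentations), 1.0 / len(augmentations))
    return np.tensordot(weights, probs, axes=1)  # (n, K)

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold for the score 1 - p(true class)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample correction: take the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points: every class is included
    return np.sort(scores)[k - 1]

def prediction_sets(test_probs, qhat):
    """Boolean (n, K) mask: class k is in the set when its score is <= qhat."""
    return (1.0 - test_probs) <= qhat
```

In this sketch the same augmentations and weights are applied to calibration and test inputs, so the scores stay exchangeable and the usual coverage argument carries over. If the weights were learned, they would need to be fit on data disjoint from the examples used to compute the threshold.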
