Conformal Prediction for Long-Tailed Classification
Tiffany Ding, Jean-Baptiste Fermanian, Joseph Salmon
TL;DR
This paper addresses uncertainty quantification in long-tailed multi-class classification by designing conformal prediction (CP) procedures that guarantee marginal coverage while balancing set size against class-conditional coverage. It introduces a score function that targets macro-coverage, prevalence-adjusted softmax (PAS), and a weighted variant, WPAS; used with standard CP, these approximate optimal macro-coverage while keeping prediction sets small. It also proposes Interp-Q, which interpolates smoothly between classwise and marginal CP. On Pl@ntNet-300K and iNaturalist-2018, PAS achieves Pareto-optimal trade-offs, Interp-Q provides tunable control over the balance between set size and coverage, and WPAS boosts tail-class coverage when desired. The approach supports practical, scalable uncertainty quantification in long-tailed domains such as biodiversity identification and rare-event detection.
Abstract
Many real-world classification problems, such as plant identification, have extremely long-tailed class distributions. For prediction sets to be useful in such settings, they should (i) provide good class-conditional coverage, ensuring that rare classes are not systematically omitted from the prediction sets, and (ii) be of reasonable size, allowing users to easily verify candidate labels. Unfortunately, existing conformal prediction methods, when applied to the long-tailed setting, force practitioners to make a binary choice between small sets with poor class-conditional coverage and extremely large sets with very good class-conditional coverage. We propose methods with guaranteed marginal coverage that smoothly trade off between set size and class-conditional coverage. First, we introduce a new conformal score function called prevalence-adjusted softmax that targets macro-coverage, a relaxed notion of class-conditional coverage. Second, we propose a new procedure that interpolates between marginal and class-conditional conformal prediction by linearly interpolating their conformal score thresholds. We demonstrate our methods on Pl@ntNet-300K and iNaturalist-2018, two long-tailed image datasets with 1,081 and 8,142 classes, respectively.
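To make the two ideas above concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that the abstract does not spell out: the prevalence-adjusted score is taken to be the negative softmax probability divided by the empirical class prevalence; the interpolation weight `tau` is a hypothetical parameter, with `tau = 0` giving the marginal threshold and `tau = 1` the classwise thresholds; and all function names are illustrative. The sketch shows only the mechanics and does not by itself establish any coverage guarantee.

```python
import numpy as np

def prevalence_adjusted_scores(softmax_probs, class_prevalence):
    """Assumed form of a prevalence-adjusted score: -softmax / prevalence.

    softmax_probs: (n, K) array; class_prevalence: (K,) array of positive
    class frequencies. Lower scores mean the label conforms better.
    """
    return -softmax_probs / class_prevalence

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1)(1 - alpha))-th smallest calibration score.

    Returns +inf when there are too few scores for that rank to exist.
    """
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if n == 0 or k > n:
        return np.inf
    return np.sort(scores)[k - 1]

def interpolated_thresholds(cal_scores, cal_labels, num_classes, alpha, tau):
    """Linearly interpolate between marginal and classwise thresholds.

    cal_scores[i] is the score of the true label cal_labels[i].
    """
    q_marginal = conformal_quantile(cal_scores, alpha)
    q_classwise = np.array([
        conformal_quantile(cal_scores[cal_labels == k], alpha)
        for k in range(num_classes)
    ])
    # Handle the endpoints explicitly so that 0 * inf never produces NaN.
    if tau == 0:
        return np.full(num_classes, q_marginal, dtype=float)
    if tau == 1:
        return q_classwise
    return tau * q_classwise + (1 - tau) * q_marginal

def prediction_sets(test_scores, thresholds):
    """Boolean (n_test, K) mask: label k is included if its score <= q_k."""
    return test_scores <= thresholds
```

In this sketch, a class with too few calibration examples receives an infinite classwise threshold, so for any `tau > 0` it is always included in the prediction set; this is one way to see why purely classwise thresholds can produce very large sets when there are many rare classes.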
