Class conditional conformal prediction for multiple inputs by p-value aggregation
Jean-Baptiste Fermanian, Mohamed Hebiri, Joseph Salmon
TL;DR
This paper extends conformal prediction to classification settings where multiple observations of the same instance are available, addressing the challenge of preserving class-conditional coverage while leveraging information from all inputs. It develops a rigorous p-value aggregation framework that relies on the exact joint distribution of conformal p-values, enabling efficient prediction-set construction via score-based envelopes (quantile-, area-, and distance-based). The authors introduce practical aggregation methods, including refined majority voting and several envelope-based scores, and validate them on synthetic mixtures and the LifeCLEF Plant Identification Task, showing improved informativeness (smaller sets) without sacrificing coverage. The work highlights the importance of exchangeability for multi-input aggregation, proposes randomized p-values to handle ties, and offers a pathway to scalable, class-conditional uncertainty quantification in citizen-science and similar multi-view classification scenarios.
Abstract
Conformal prediction methods are statistical tools designed to quantify uncertainty and generate predictive sets with guaranteed coverage probabilities. This work introduces an innovative refinement to these methods for classification tasks, specifically tailored for scenarios where multiple observations (multi-inputs) of a single instance are available at prediction time. Our approach is particularly motivated by applications in citizen science, where multiple images of the same plant or animal are captured by individuals. Our method integrates the information from each observation into conformal prediction, enabling a reduction in the size of the predicted label set while preserving the required class-conditional coverage guarantee. The approach is based on the aggregation of conformal p-values computed from each observation of a multi-input. By exploiting the exact distribution of these p-values, we propose a general aggregation framework using an abstract scoring function, encompassing many classical statistical tools. Knowledge of this distribution also enables refined versions of standard strategies, such as majority voting. We evaluate our method on simulated and real data, with a particular focus on Pl@ntNet, a prominent citizen science platform that facilitates the collection and identification of plant species through user-submitted images.
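To make the idea of aggregating conformal p-values concrete, the following is a minimal sketch and not the authors' method. It computes standard class-conditional conformal p-values for each observation of a multi-input, then combines them with a Bonferroni correction, a generic and conservative baseline that stays valid under arbitrary dependence between the p-values. The paper's framework instead exploits the exact joint distribution of the p-values. All function names, the nonconformity scores, and the toy data are hypothetical, and the sketch assumes that each observation is exchangeable with the calibration points of its true class and that larger scores indicate worse conformity.

```python
import numpy as np

def class_conditional_p_values(cal_scores, cal_labels, test_scores):
    """Class-conditional conformal p-values (hypothetical helper).

    cal_scores:  (n,) nonconformity scores of calibration points at their true label.
    cal_labels:  (n,) integer labels in {0, ..., C-1}.
    test_scores: (K, C) nonconformity score of each of the K observations
                 for each candidate label.
    Returns a (K, C) array of p-values.
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_scores = np.asarray(test_scores, dtype=float)
    n_obs, n_classes = test_scores.shape
    p = np.empty((n_obs, n_classes))
    for y in range(n_classes):
        s_y = cal_scores[cal_labels == y]  # calibration scores of class y only
        # Number of class-y calibration scores at least as large as each test score.
        n_ge = (s_y[None, :] >= test_scores[:, y][:, None]).sum(axis=1)
        p[:, y] = (1 + n_ge) / (len(s_y) + 1)
    return p

def bonferroni_aggregate(p):
    """Combine K p-values per label: min(1, K * min_k p_k). Conservative baseline."""
    n_obs = p.shape[0]
    return np.minimum(1.0, n_obs * p.min(axis=0))

def prediction_set(p_agg, alpha):
    """Keep every label whose aggregated p-value exceeds alpha."""
    return [int(y) for y in np.flatnonzero(p_agg > alpha)]

# Toy usage: two classes, four calibration points each, two observations
# of the same instance (all values are made up for illustration).
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
cal_labels = [0, 0, 0, 0, 1, 1, 1, 1]
test_scores = [[0.25, 0.90],
               [0.15, 0.95]]
p = class_conditional_p_values(cal_scores, cal_labels, test_scores)
print(prediction_set(bonferroni_aggregate(p), alpha=0.5))
```

The Bonferroni rule only uses the smallest p-value per label, so it tends to produce larger sets than necessary as the number of observations grows. This conservativeness is the kind of loss that an aggregation based on the joint distribution of the p-values aims to reduce.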
