Transductive Conformal Inference for Full Ranking
Jean-Baptiste Fermanian, Pierre Humbert, Gilles Blanchard
TL;DR
The paper addresses uncertainty quantification for full ranking when only the relative order of a subset of items is known. It introduces a transductive conformal prediction approach that uses bounds on the unknown conformity scores to produce marginally valid prediction sets for the ranks of all items and to control the false coverage proportion (FCP) across multiple predictions. Two score paradigms, RA and VA, are developed, with theoretical envelopes and numerical Monte Carlo envelopes that bound calibration-to-test rank transfers and yield substantially shorter prediction sets than naive bounds. Experiments on synthetic data and real-world datasets (Yummly-10k and Anime LTR) show robust FCP control and competitive, adaptive interval lengths for state-of-the-art ranking algorithms such as RankNet and LambdaMART.
Abstract
We introduce a method based on Conformal Prediction (CP) to quantify the uncertainty of full ranking algorithms. We focus on a specific scenario where $n+m$ items are to be ranked by some ``black box'' algorithm. It is assumed that the relative (ground truth) ranking of $n$ of them is known. The objective is then to quantify the error made by the algorithm on the ranks of the $m$ new items among the total $(n+m)$. In such a setting, the true ranks of the $n$ original items among the total $(n+m)$ depend on the (unknown) true ranks of the $m$ new ones. Consequently, we have no direct access to a calibration set to which a classical CP method could be applied. To address this challenge, we propose to construct distribution-free bounds on the unknown conformity scores using recent results on the distribution of conformal p-values. Using these upper bounds on the scores, we provide valid prediction sets for the rank of any item. We also control the false coverage proportion, a crucial quantity when dealing with multiple prediction sets. Finally, we empirically show on both synthetic and real data the efficiency of our CP method for state-of-the-art algorithms such as RankNet or LambdaMART.
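
The dependence of the calibration ranks on the unknown test ranks can be illustrated with a small simulation. This is a minimal sketch and not the paper's procedure: the helper names `full_ranks` and `naive_rank_bounds` and the uniformly drawn latent scores are assumptions introduced here for illustration only. It shows that an item with relative rank $r$ among the $n$ known items has a rank between $r$ and $r+m$ among all $n+m$ items, since its rank grows by one for each new item placed ahead of it. This is a crude deterministic bound; the envelopes developed in the paper are not reproduced here.

```python
import random


def full_ranks(scores):
    """Return 1-based ranks by descending score (rank 1 is the best item)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, item in enumerate(order, start=1):
        ranks[item] = position
    return ranks


def naive_rank_bounds(relative_rank, m):
    """Bounds on the rank among n + m items, given the rank among the n known items.

    Each of the m new items either precedes the item (rank grows by one)
    or does not, so the full rank lies in [relative_rank, relative_rank + m].
    """
    return relative_rank, relative_rank + m


rng = random.Random(0)
n, m = 5, 3

# Hypothetical latent quality of each item; the first n items are the known ones.
true_scores = [rng.random() for _ in range(n + m)]

relative = full_ranks(true_scores[:n])  # observable: ranking among the n known items
total = full_ranks(true_scores)         # not observable: ranking among all n + m items

bounds = [naive_rank_bounds(r, m) for r in relative]
covered = all(lo <= t <= hi for (lo, hi), t in zip(bounds, total[:n]))

for r, t, (lo, hi) in zip(relative, total[:n], bounds):
    print(f"relative rank {r} -> full rank {t}, naive bounds [{lo}, {hi}]")
print("all full ranks inside naive bounds:", covered)
```

The interval has length $m$ regardless of the data, which is why sharper, data-dependent bounds on the conformity scores are needed to obtain short prediction sets.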
