Inference-time Stochastic Ranking with Risk Control
Ruocheng Guo, Jean-François Ton, Yang Liu, Hang Li
TL;DR
This work tackles fairness in learning-to-rank by addressing exposure bias in deterministic rankers and the high training cost of stochastic PL-based methods. It proposes Inference-time Stochastic Ranking with Risk Control (ISRR), which builds a Generalized Plackett-Luce (GPL) model atop pre-trained scoring functions and uses distribution-free risk control to guarantee a user-specified utility or fairness level at inference time. ISRR enables principled, finite-sample guarantees on ranking performance through calibration data, employing either p-value (HB) or UCB-based thresholds to select per-position candidate sets, and it interpolates between PL and deterministic ranking via thresholds $\bm{\lambda}$. Empirical results on Yahoo, MSLR-WEB30K, and Istella-S show that ISRR matches or exceeds the utility-fairness performance of existing stochastic methods while dramatically reducing training cost, and it provides finite-sample guarantees on the chosen metrics, making it practical for real-world deployment.
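The interpolation between PL and deterministic ranking described above can be illustrated with a minimal sketch. The function name `sample_ranking` and the candidate-set rule (keep items whose PL probability is at least `lambdas[k]` times the largest one) are hypothetical choices made for illustration; they are not the paper's definition of GPL, only one simple rule with the same two extremes.

```python
import numpy as np

def sample_ranking(scores, lambdas, rng):
    """Sample one ranking, position by position, from pre-trained scores.

    scores  : 1-D sequence of ranking scores (higher is better).
    lambdas : per-position thresholds in [0, 1] (hypothetical rule, see below).
    rng     : a numpy.random.Generator.
    """
    scores = np.asarray(scores, dtype=float)
    remaining = list(range(len(scores)))
    ranking = []
    for k in range(len(scores)):
        s = scores[remaining]
        # Plackett-Luce choice probabilities over the remaining items
        # (softmax, shifted by the max for numerical stability).
        p = np.exp(s - s.max())
        p /= p.sum()
        # ASSUMPTION: the candidate set keeps items whose probability is at
        # least lambdas[k] times the largest probability at this position.
        keep = p >= lambdas[k] * p.max()
        q = np.where(keep, p, 0.0)
        q /= q.sum()
        # Sample the next item from the renormalized candidate set.
        pick = int(rng.choice(len(remaining), p=q))
        ranking.append(remaining.pop(pick))
    return ranking
```

Under this rule, thresholds of 1 at every position leave only the top-scored remaining item (or exact ties) in each candidate set, so the output is the usual score-sorted ranking, while thresholds of 0 keep every item and recover plain PL sampling.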
Abstract
Learning to Rank (LTR) methods are vital in online economies, affecting both users and item providers. Fairness in LTR models is crucial for allocating exposure in proportion to item relevance. Widely used deterministic LTR models can lead to an unfair distribution of exposure, especially when items with the same relevance receive slightly different ranking scores. Stochastic LTR models, which incorporate the Plackett-Luce (PL) ranking model, address these fairness issues but suffer from high training cost. In addition, they cannot provide guarantees on utility or fairness, which can lead to dramatically degraded utility when they are optimized for fairness. To overcome these limitations, we propose Inference-time Stochastic Ranking with Risk Control (ISRR), a novel method that performs stochastic ranking at inference time with guaranteed utility or fairness, given pretrained scoring functions from deterministic or stochastic LTR models. Comprehensive experimental results on three widely adopted datasets demonstrate that our proposed method achieves utility and fairness comparable to existing stochastic ranking methods at much lower computational cost. In addition, the results verify that our method provides finite-sample guarantees on utility and fairness. This advancement represents a significant contribution to the field of stochastic ranking and fair LTR, with promising real-world applications.
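To make the notion of a finite-sample guarantee concrete, the following is a minimal sketch of threshold selection with a Hoeffding upper confidence bound on calibration data. It is not the paper's procedure: the names `hoeffding_ucb` and `calibrate_threshold`, the grid of thresholds, and the loss table are hypothetical, and the sketch assumes per-query losses bounded in [0, 1] (for example, one minus a normalized utility) and a risk that does not increase as the threshold grows.

```python
import math
import numpy as np

def hoeffding_ucb(losses, delta):
    """Upper confidence bound on the mean of i.i.d. losses in [0, 1].

    By Hoeffding's inequality, the true mean is at most this value
    with probability at least 1 - delta.
    """
    n = len(losses)
    return float(np.mean(losses)) + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def calibrate_threshold(lambda_grid, loss_table, alpha, delta):
    """Pick the smallest threshold that still passes the UCB check.

    lambda_grid : increasing sequence of candidate thresholds (hypothetical).
    loss_table  : array of shape (len(lambda_grid), n_calibration_queries);
                  entry [i, j] is the loss of query j under lambda_grid[i].
    alpha       : user-specified risk level.
    delta       : allowed failure probability.

    Scans from the largest threshold downward and stops at the first
    threshold whose UCB exceeds alpha (ASSUMPTION: risk is monotone
    non-increasing in the threshold). Returns None if none qualifies.
    """
    chosen = None
    for i in range(len(lambda_grid) - 1, -1, -1):
        if hoeffding_ucb(loss_table[i], delta) > alpha:
            break
        chosen = lambda_grid[i]
    return chosen
```

The design choice here is to return the smallest passing threshold, on the assumption that smaller thresholds correspond to more stochastic, and therefore more evenly exposed, rankings; a deployment that prioritizes utility could instead keep a larger threshold.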
