Rate-Optimal Rank Aggregation with Private Pairwise Rankings
Shirong Xu, Will Wei Sun, Guang Cheng
TL;DR
This work tackles privacy-preserving rank aggregation from pairwise rankings by modeling users under a linear stochastic transitivity framework and showing that standard randomized response perturbations violate the underlying model. It introduces Adaptive Debiased RR (ADRR), combining a debiasing step with adaptive weights to preserve unbiased estimates while accommodating heterogeneous local DP budgets. The authors establish minimax-optimal estimation rates for the true preference vector under both Bradley–Terry–Luce and general pairwise models, and derive exponential convergence results for top-$K$ and full ranking recovery as the number of items and respondents grows. Empirical results, including simulations and a real car preference dataset, demonstrate substantial utility gains of ADRR over classic RR and perturbation-based methods, highlighting its practical impact for privacy-aware ranking systems and downstream decision tasks.
Abstract
In various real-world scenarios, such as recommender systems and political surveys, pairwise rankings are commonly collected and utilized for rank aggregation to derive an overall ranking of items. However, preference rankings can reveal individuals' personal preferences, highlighting the need to protect them from exposure in downstream analysis. In this paper, we address the challenge of preserving privacy while ensuring the utility of rank aggregation based on pairwise rankings generated from a general comparison model. A common privacy protection strategy in practice is the use of the randomized response mechanism to perturb raw pairwise rankings. However, a critical challenge arises because the privatized rankings no longer adhere to the original model, resulting in significant bias in downstream rank aggregation tasks. To address this, we propose an adaptive debiasing method for rankings from the randomized response mechanism, ensuring consistent estimation of true preferences and enhancing the utility of downstream rank aggregation. Theoretically, we provide insights into the relationship between overall privacy guarantees and estimation errors in private ranking data, and establish minimax rates for estimation errors. This enables the determination of optimal privacy guarantees that balance consistency in rank aggregation with privacy protection. We also investigate convergence rates of expected ranking errors for partial and full ranking recovery, quantifying how privacy protection affects the specification of top-$K$ item sets and complete rankings. Our findings are validated through extensive simulations and a real-world application.
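To make the bias issue concrete, the following is a minimal sketch of the textbook binary randomized response mechanism and its standard debiasing step, applied to a single pairwise comparison. It is not the paper's adaptive method: the adaptive weighting across heterogeneous privacy budgets is not reproduced here. The function names, the flip probability $1/(1+e^{\epsilon})$, and the encoding of a comparison as 0 or 1 are assumptions made for illustration.

```python
import math
import random


def flip_probability(epsilon):
    """Flip probability of textbook randomized response (assumes epsilon > 0)."""
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response(y, epsilon, rng):
    """Report the true binary comparison y, flipped with probability q."""
    q = flip_probability(epsilon)
    return 1 - y if rng.random() < q else y


def debias(z, epsilon):
    """Map a privatized report z to an unbiased estimate of the true comparison.

    Since E[z | y] = q + y * (1 - 2q), the value (z - q) / (1 - 2q)
    has expectation y.
    """
    q = flip_probability(epsilon)
    return (z - q) / (1.0 - 2.0 * q)


def simulate(p, epsilon, n, seed=0):
    """Return (naive, debiased) estimates of the win probability p.

    Each of n hypothetical respondents prefers item i over item j with
    probability p and reports that preference through randomized response.
    """
    rng = random.Random(seed)
    reports = [
        randomized_response(1 if rng.random() < p else 0, epsilon, rng)
        for _ in range(n)
    ]
    naive = sum(reports) / n
    debiased = sum(debias(z, epsilon) for z in reports) / n
    return naive, debiased
```

Under these assumptions, the naive average of privatized reports concentrates around $q + p(1 - 2q)$ rather than $p$, so it is shrunk toward $1/2$, while the debiased average concentrates around $p$. The division by $1 - 2q$ also shows the cost of privacy: as $\epsilon$ shrinks, $q$ approaches $1/2$ and the variance of the debiased estimate grows.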
