Rate-Optimal Rank Aggregation with Private Pairwise Rankings

Shirong Xu; Will Wei Sun; Guang Cheng

TL;DR

This work tackles privacy-preserving rank aggregation from pairwise rankings by modeling users under a linear stochastic transitivity framework and showing that the standard randomized response (RR) perturbation violates the underlying model. It introduces Adaptive Debiased RR (ADRR), which combines a debiasing step with adaptive weights to keep estimates unbiased while accommodating heterogeneous local differential privacy (DP) budgets. The authors establish minimax-optimal estimation rates for the true preference vector under both the Bradley–Terry–Luce (BTL) model and general pairwise comparison models, and derive exponential convergence results for top-$K$ and full ranking recovery as the numbers of items and respondents grow. Empirical results, including simulations and a real car preference dataset, show substantial utility gains of ADRR over classic RR and perturbation-based methods, which matters for privacy-aware ranking systems and downstream decision tasks.
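To see why plain RR breaks the model, here is a sketch for the BTL special case. The symbols $q$, $p_{ij}$, $\tilde p_{ij}$, and $\tilde y_{ij}$ are introduced here for illustration and are not taken from the paper. Assume each binary comparison is flipped independently with probability $q = 1/(1+e^{\epsilon})$, the textbook $\epsilon$-local-DP randomized response:

```latex
\begin{align*}
p_{ij} &= \Pr(i \succ j) = \frac{1}{1+e^{-(\theta_i-\theta_j)}}, \\
\tilde p_{ij} &= (1-q)\,p_{ij} + q\,(1-p_{ij}) = q + (1-2q)\,p_{ij}, \\
\mathbb{E}\!\left[\frac{\tilde y_{ij} - q}{1-2q}\right] &= \frac{\tilde p_{ij} - q}{1-2q} = p_{ij}.
\end{align*}
```

The privatized probability $\tilde p_{ij}$ is confined to $[q, 1-q]$, so as a function of $\theta_i - \theta_j$ it is no longer logistic. The rescaled response in the last line, by contrast, has the original $p_{ij}$ as its mean. This covers only the debiasing idea, not the adaptive weighting in ADRR.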

Abstract

In various real-world scenarios, such as recommender systems and political surveys, pairwise rankings are commonly collected and utilized for rank aggregation to derive an overall ranking of items. However, preference rankings can reveal individuals' personal preferences, highlighting the need to protect them from exposure in downstream analysis. In this paper, we address the challenge of preserving privacy while ensuring the utility of rank aggregation based on pairwise rankings generated from a general comparison model. A common privacy protection strategy in practice is the use of the randomized response mechanism to perturb raw pairwise rankings. However, a critical challenge arises because the privatized rankings no longer adhere to the original model, resulting in significant bias in downstream rank aggregation tasks. To address this, we propose an adaptive debiasing method for rankings from the randomized response mechanism, ensuring consistent estimation of true preferences and enhancing the utility of downstream rank aggregation. Theoretically, we provide insights into the relationship between overall privacy guarantees and estimation errors in private ranking data, and establish minimax rates for estimation errors. This enables the determination of optimal privacy guarantees that balance consistency in rank aggregation with privacy protection. We also investigate convergence rates of expected ranking errors for partial and full ranking recovery, quantifying how privacy protection affects the specification of top-$K$ item sets and complete rankings. Our findings are validated through extensive simulations and a real-world application.
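The bias introduced by randomized response, and the effect of a simple debiasing step, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's ADRR mechanism. It uses the textbook flip probability $1/(1+e^{\epsilon})$, a single item pair with a made-up win probability, and one shared privacy budget. It omits the adaptive weights for heterogeneous budgets. The function names `randomized_response` and `debias` are hypothetical.

```python
import numpy as np

def randomized_response(y, eps, rng):
    """Flip each binary comparison with probability q = 1 / (1 + e^eps).

    Assumption: this is the textbook eps-local-DP randomized response,
    not necessarily the exact mechanism analyzed in the paper.
    """
    q = 1.0 / (1.0 + np.exp(eps))
    flips = rng.random(y.shape) < q
    return np.where(flips, 1 - y, y)

def debias(y_priv, eps):
    """Rescale privatized responses so their mean matches the raw mean.

    E[y_priv] = q + (1 - 2q) * E[y], hence (y_priv - q) / (1 - 2q)
    is unbiased for E[y].
    """
    q = 1.0 / (1.0 + np.exp(eps))
    return (y_priv - q) / (1.0 - 2.0 * q)

rng = np.random.default_rng(0)
p_true = 0.8      # hypothetical probability that item i beats item j
eps = 1.0         # hypothetical privacy budget shared by all respondents
n = 200_000       # number of simulated respondents

# Raw comparisons: 1 if the respondent prefers item i over item j.
y = (rng.random(n) < p_true).astype(int)
y_priv = randomized_response(y, eps, rng)

naive = y_priv.mean()                 # biased toward 1/2
corrected = debias(y_priv, eps).mean()  # approximately p_true

print(f"true={p_true:.3f} naive={naive:.3f} debiased={corrected:.3f}")
```

The naive average of the privatized comparisons is pulled toward 1/2, while the debiased average recovers the underlying win probability at the cost of a variance inflated by the factor $1/(1-2q)^2$.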

Paper Structure (28 sections, 228 equations, 14 figures, 11 tables)

Introduction
Related Work
Paper Organization
Notation
Preliminaries on Pairwise Ranking Model
Differentially Private Pairwise Rankings
Infeasibility of Classic RR Mechanism
Adaptive Debiased RR Mechanism
Differentially Private Rank Aggregation
Parameter Estimation
BTL model
General Pairwise Comparison Model
Top-$K$ Ranking Recovery
Full Ranking Recovery
Numerical Experiments
...and 13 more sections

Figures (14)

Figure 1: Utility-Preserving Private Pairwise Ranking Mechanism.
Figure 2: The process of data collection, privacy protection, and rank aggregation.
Figure 3: Boxplots of the smallest non-zero eigenvalues for the cases $(L,p,\epsilon)=(30,0.8,2)$ (Left) and $(m,L,\epsilon)=(30,30,2)$ (Right) under the TM model with $\bm{\theta}$ equally spaced in $[0, 1]$, showing that $\Lambda_{\min,\perp}\left(\nabla^2\mathcal{L}_{0}(\bm\theta)\right)$ exhibits a linear relationship with $m$ and $p$.
Figure 4: The averaged estimation errors $m^{-\frac{1}{2}} \Vert \widehat{\bm{\theta}} - \bm{\theta}^\star \Vert_2$ (Left) and $\Vert \widehat{\bm{\theta}} - \bm{\theta}^\star \Vert_\infty$ (Right) of all cases under both the BTL and TM models in Scenario I.
Figure 5: Fitted linear models relating the estimation errors to $\frac{1}{\sqrt{B(\bm{\epsilon})}}$ in Scenario II. Here, the dots represent the estimation errors of 200 replicates, and the shaded areas represent the corresponding 95% prediction intervals.
...and 9 more figures
