Inference of rankings planted in random tournaments

Dmitriy Kunisky; Daniel A. Spielman; Xifan Yu

TL;DR

This work studies the inference of a hidden ranking from noisy pairwise comparisons, encoded as a random tournament whose edge directions are biased by a hidden permutation $\pi$ with signal strength $\gamma$. It establishes detection and recovery thresholds: strong detection is possible when $\gamma=\omega(n^{-3/4})$ and strong recovery when $\gamma=\omega(n^{-1/2})$, so detection requires much weaker correlations than recovery (a detection-recovery gap). A simple Ranking By Wins algorithm, which orders the vertices by out-degree, achieves optimal or near-optimal performance in recovery and alignment for the planted model. The authors connect alignment maximization to maximum likelihood estimation: the MLE maximizes the alignment objective, which is NP-hard in general, while Ranking By Wins gives a near-ML solution in the planted regime and achieves a $(1-o(1))$-approximation to maximum alignment for $\gamma=\omega(n^{-1/2})$. Methodologically, the paper uses low-degree polynomial detectors, spectral comparisons, Fourier-analytic tools, and Berry–Esseen style bounds to analyze detection and recovery, complemented by information-theoretic lower bounds via KL divergences. The results clarify when efficient algorithms can reliably recover latent rankings from noisy comparisons.

Abstract

We consider the problem of inferring an unknown ranking of $n$ items from a random tournament on $n$ vertices whose edge directions are correlated with the ranking. We establish, in terms of the strength of these correlations, the computational and statistical thresholds for detection (deciding whether an observed tournament is purely random or drawn correlated with a hidden ranking) and recovery (estimating the hidden ranking with small error in Spearman's footrule or Kendall's tau metric on permutations). Notably, we find that this problem provides a new instance of a detection-recovery gap: solving the detection problem requires much weaker correlations than solving the recovery problem. In establishing these thresholds, we also identify simple algorithms for detection (thresholding a degree 2 polynomial) and recovery (outputting a ranking by the number of "wins" of a tournament vertex, i.e., the out-degree) that achieve optimal performance up to constants in the correlation strength. For detection, we find that the above low-degree polynomial algorithm is superior to a natural spectral algorithm. We also find that, whenever it is possible to achieve strong recovery (i.e., to estimate with vanishing error in the above metrics) of the hidden ranking, then the above "Ranking By Wins" algorithm not only does so, but also outputs a close approximation of the maximum likelihood estimator, a task that is NP-hard in the worst case.
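The two simple algorithms named in the abstract can be illustrated with a minimal NumPy sketch. Several things here are assumptions for illustration rather than details taken from the paper: the planted model is taken to be "the higher-ranked item of each pair wins independently with probability 1/2 + gamma", ties in win counts are broken by index, and the degree 2 statistic shown (the sum of squared centered out-degrees) is one natural degree 2 polynomial in the edge directions, not necessarily the exact polynomial the authors threshold. All function names are hypothetical.

```python
import numpy as np


def sample_planted_tournament(n, gamma, rng):
    """Sample (T, rank). T[i, j] = 1 means i beats j; rank[i] = 0 is the top item.

    Assumed model: the higher-ranked item of each pair wins with
    probability 1/2 + gamma, independently across pairs.
    """
    rank = rng.permutation(n)
    i_above_j = rank[:, None] < rank[None, :]
    p_i_wins = np.where(i_above_j, 0.5 + gamma, 0.5 - gamma)
    coin = rng.random((n, n)) < p_i_wins
    i_wins = np.triu(coin, k=1)      # decide each pair once, from entries with i < j
    j_wins = np.triu(~coin, k=1).T   # the complementary outcomes, mirrored
    return (i_wins | j_wins).astype(int), rank


def rank_by_wins(T):
    """Rank items by out-degree (number of wins); ties broken by index (an assumption)."""
    wins = T.sum(axis=1)
    order = np.argsort(-wins, kind="stable")
    est_rank = np.empty(len(wins), dtype=int)
    est_rank[order] = np.arange(len(wins))
    return est_rank


def kendall_tau_error(r1, r2):
    """Fraction of item pairs that the two rankings order differently."""
    n = len(r1)
    s1 = np.sign(r1[:, None] - r1[None, :])
    s2 = np.sign(r2[:, None] - r2[None, :])
    discordant = np.count_nonzero(s1 * s2 < 0) // 2
    return discordant / (n * (n - 1) / 2)


def degree_two_statistic(T):
    """Sum of squared centered out-degrees; its mean is n(n-1)/4 when gamma = 0."""
    n = T.shape[0]
    centered = T.sum(axis=1) - (n - 1) / 2
    return float((centered ** 2).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, rank = sample_planted_tournament(400, 0.3, rng)
    print("Kendall tau error:", kendall_tau_error(rank_by_wins(T), rank))
    print("degree 2 statistic:", degree_two_statistic(T), "null mean:", 400 * 399 / 4)
```

In this sketch, detection would compare `degree_two_statistic(T)` against a threshold somewhat above the null mean, and recovery would output `rank_by_wins(T)`. The quadratic-time Kendall tau computation is chosen for readability and is adequate only for small `n`.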

Paper Structure (19 sections, 20 theorems, 97 equations)

Introduction
Main Contributions
Related Works
Main Results
Notations
Detection
Recovery
Alignment Maximization and Maximum Likelihood Estimation
Proofs of Detection Thresholds
Preliminaries
Information-Theoretic Impossibility of Detection
Low-Degree Detection Algorithm
Suboptimality of Spectral Detection Algorithm
Tools for Analysis of Ranking By Wins Algorithm
Proofs of Recovery Thresholds
...and 4 more sections

Key Result

Theorem 2.2

The following hold:

Theorems & Definitions (48)

Definition 2.1: Weak and strong detection
Theorem 2.2: Detection thresholds
Theorem 2.3: Spectral detection thresholds
Remark 2.4
Definition 2.5: Kendall tau metric
Remark 2.6: Spearman's footrule distance
Definition 2.7: Weak and strong recovery
Theorem 2.8: Recovery thresholds
Remark 2.9
Remark 2.10
...and 38 more
