Spectral Ranking Inferences based on General Multiway Comparisons

Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu

TL;DR

This work addresses rank inference under general multiway comparisons where hyper-edge sizes vary and per-edge repeats may be as few as one. It develops a spectral ranking framework based on a Markov chain with transitions encoded by a weighting function $f(A_l)$, linking to Plackett-Luce and Luce's axiom, and shows that a two-step spectral method with optimal weighting achieves asymptotic efficiency equal to the MLE. The authors derive asymptotic normality for the estimated scores, construct one-sample and two-sample confidence intervals for ranks, and introduce a Gaussian multiplier bootstrap to calibrate critical values for simultaneous inference, including top-$K$ testing. They validate the approach via extensive simulations and real-data analyses on statistics journals and Netflix movie rankings, demonstrating accurate uncertainty quantification under heterogeneous, fixed and random graphs. Overall, the paper relaxes previous sampling assumptions, provides a unified inference framework for fixed and random graphs with variable edge sizes, and introduces two-sample rank testing as a novel tool in ranking problems.

Abstract

This paper studies the performance of the spectral method in the estimation and uncertainty quantification of the unobserved preference scores of compared entities in a general and more realistic setup. Specifically, the comparison graph consists of hyper-edges of possible heterogeneous sizes, and the number of comparisons can be as low as one for a given hyper-edge. Such a setting is pervasive in real applications, circumventing the need to specify the graph randomness and the restrictive homogeneous sampling assumption imposed in the commonly used Bradley-Terry-Luce (BTL) or Plackett-Luce (PL) models. Furthermore, in scenarios where the BTL or PL models are appropriate, we unravel the relationship between the spectral estimator and the Maximum Likelihood Estimator (MLE). We discover that a two-step spectral method, where we apply the optimal weighting estimated from the equal weighting vanilla spectral method, can achieve the same asymptotic efficiency as the MLE. Given the asymptotic distributions of the estimated preference scores, we also introduce a comprehensive framework to carry out both one-sample and two-sample ranking inferences, applicable to both fixed and random graph settings. It is noteworthy that this is the first time effective two-sample rank testing methods have been proposed. Finally, we substantiate our findings via comprehensive numerical simulations and subsequently apply our developed methodologies to perform statistical inferences for statistical journals and movie rankings.
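
The two-step procedure described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: the data are simulated from a PL model with one winner per choice set, the transition matrix is normalized by a hypothetical constant `d`, and the stationary distribution is found by plain power iteration. The function names `spectral_scores` and `simulate` are illustrative only. The first pass uses equal weighting $f(A_l)=|A_l|$; the second pass plugs the first-pass estimate into $f(A_l)=\sum_{i\in A_l}e^{\theta_i}$.

```python
import math
import random


def spectral_scores(comparisons, n, weight):
    """Centered log-scores from (winner, choice_set) pairs; items are 0-indexed.

    weight(choice_set) plays the role of the weighting function f(A_l).
    """
    W = [[0.0] * n for _ in range(n)]
    for winner, choice_set in comparisons:
        w = 1.0 / weight(choice_set)
        for i in choice_set:
            if i != winner:
                W[i][winner] += w            # mass moves from each loser to the winner
    d = 2.0 * max(sum(row) for row in W)     # assumed normalization: diagonals stay >= 1/2
    P = [[W[i][j] / d for j in range(n)] for i in range(n)]
    for i in range(n):
        P[i][i] = 1.0 - sum(P[i])            # diagonal is 0 here, so this is 1 - off-diagonal mass
    pi = [1.0 / n] * n
    for _ in range(20000):                   # power iteration for the stationary distribution
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, pi)) < 1e-13
        pi = new
        if converged:
            break
    theta = [math.log(p) for p in pi]        # requires every item to win at least once
    mean = sum(theta) / n
    return [t - mean for t in theta]         # center so the scores sum to zero


def simulate(theta_star, num_comparisons, rng):
    """Draw choice sets of size 2 to 5 and one PL winner per set (simulation assumption)."""
    n = len(theta_star)
    data = []
    for _ in range(num_comparisons):
        choice_set = rng.sample(range(n), rng.randint(2, 5))
        probs = [math.exp(theta_star[i]) for i in choice_set]
        winner = rng.choices(choice_set, weights=probs, k=1)[0]
        data.append((winner, frozenset(choice_set)))
    return data


def max_error(estimate, truth):
    return max(abs(a - b) for a, b in zip(estimate, truth))


rng = random.Random(0)
n = 10
theta_star = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]   # true scores, summing to zero
data = simulate(theta_star, 4000, rng)

# Step 1: vanilla spectral estimator with equal weighting f(A) = |A|.
theta_vanilla = spectral_scores(data, n, weight=len)
# Step 2: re-run with the weighting estimated from step 1.
theta_two_step = spectral_scores(
    data, n, weight=lambda A: sum(math.exp(theta_vanilla[i]) for i in A)
)

print("vanilla  max error:", max_error(theta_vanilla, theta_star))
print("two-step max error:", max_error(theta_two_step, theta_star))
```

The stationary distribution does not depend on the choice of `d`, which only controls how lazy the chain is; any value that keeps the diagonal entries nonnegative would serve the same purpose in this sketch.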

Paper Structure

This paper contains 44 sections, 10 theorems, 145 equations, 4 figures, 11 tables.

Key Result

Theorem 4.1

Under Assumptions 4.1-4.2, the spectral estimator $\widetilde{\theta}_i$ has the following uniform approximation: $\widetilde{\theta}_i-\theta_i^*=J_i^*+\delta_i$, uniformly for all $i\in[n]$, where $\|\delta:=(\delta_1,\cdots,\delta_n)\|_{\infty}=o(1/\sqrt{n^\dagger})$

Figures (4)

  • Figure 1: A simple illustration of a collection $\{c_l, A_l\}_{l=1}^{D}$ with $(c_1,A_1)=(3,\{2,3,4,5\})$, $(c_2,A_2)=(2,\{1,2,3\})$, $(c_3,A_3)=(2,\{2,5\})$, $(c_4,A_4)=(4,\{4,5\})$, $(c_5,A_5)=(4,\{2,4\})$, $(c_6,A_6)=(1,\{1,4\})$, $(c_7,A_7)=(5,\{4,5\})$. The left panel illustrates the choice sets $\{A_l\}_{l=1}^7$, where all nodes inside each $A_l$ are enclosed in a region of the same color. The right panel presents the comparison-induced Markov transition matrix, whose computation is detailed in Section \ref{sec:mle_con}. A directed edge from $i$ to $j$ ($j\neq i$) exists if and only if $i$ and $j$ are compared in some $A_l$ and $j$ is the winner ($c_l=j$).
  • Figure 2: $\ell_{\infty}$ statistical errors of the spectral estimator $\widetilde{\theta}$ against the theoretical rate as $|\mathcal{D}|$ varies. We let $|\mathcal{D}|$ increase so that $\sqrt{\log n/n^{\dagger}}$ takes values on a uniform grid over $[0.05,0.20]$. The solid lines represent the statistical errors averaged over 500 repetitions, and the shaded areas are bounded by the curves one standard deviation above and below the average. The blue and yellow curves correspond to $f(A_l) = |A_l|$ and $f(A_l)=\sum_{i\in A_l}e^{\theta_i^*}$ (oracle weight), respectively.
  • Figure 3: Histograms for the normalized quantities $\rho_8(\widetilde{\theta})(\widetilde{\theta}_8-\theta_8^*), \rho_{20}(\widetilde{\theta})(\widetilde{\theta}_{20}-\theta_{20}^*),$ and $\rho_{30}(\widetilde{\theta})(\widetilde{\theta}_{30}-\theta_{30}^*)$. Here, $\rho_i(\widetilde{\theta})$ is utilized as an estimator of the inverse of the theoretical standard deviation of $\widetilde{\theta}_i$. The black curves denote the standard Gaussian distribution. For this analysis, the total number of comparisons, $|\mathcal{D}|$, is set to 12,000, while the rest of the settings remain consistent with those outlined earlier. We use $f(A_l)=|A_l|$ for the three plots on the first row and use $f(A_l)=\sum_{i\in A_l}e^{\theta_i^*}$ for the three plots on the second row.
  • Figure 4: PP-plot of the empirical probability $\widehat{\mathbb{P}}({\mathcal{T}} > \mathcal{G}_{1-\alpha})$ of ${\mathcal{T}}$ given in Section \ref{sec:onesam_twosided_inf} with $\mathcal{M}=\{8,20,30\}$ against the theoretical significance level $\alpha$. Here we choose $|\mathcal{D}|=24000$. The red solid line represents the theoretical probability, and the blue and green dash-dotted lines represent the empirical probabilities using the Vanilla Spectral estimator and the Oracle Spectral estimator, respectively. The purple dotted line marks the significance level $0.05$.
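
The Figure 1 example is small enough to work through by hand. The sketch below builds a comparison-induced transition matrix from the seven $(c_l, A_l)$ pairs in the caption, using exact fractions, and lists the resulting directed edges. It assumes equal weighting $f(A_l)=|A_l|$ and a hypothetical normalizing constant `d`; the paper's own construction is the one detailed in its referenced section and may differ in these choices.

```python
from fractions import Fraction

# (winner c_l, choice set A_l) pairs copied from the Figure 1 caption; items are 1-indexed.
comparisons = [
    (3, {2, 3, 4, 5}),
    (2, {1, 2, 3}),
    (2, {2, 5}),
    (4, {4, 5}),
    (4, {2, 4}),
    (1, {1, 4}),
    (5, {4, 5}),
]
n = 5

# Accumulate 1 / f(A_l) from every non-winning item to the winner, with f(A_l) = |A_l|.
W = [[Fraction(0)] * n for _ in range(n)]
for winner, choice_set in comparisons:
    for i in choice_set - {winner}:
        W[i - 1][winner - 1] += Fraction(1, len(choice_set))

# Assumed normalization (not taken from the paper): keeps every diagonal entry >= 1/2.
d = 2 * max(sum(row) for row in W)
P = [[W[i][j] / d for j in range(n)] for i in range(n)]
for i in range(n):
    P[i][i] = 1 - sum(P[i])   # diagonal is 0 here, so each row now sums to 1

# Directed edge i -> j whenever j beat i in some choice set.
edges = sorted(
    (i + 1, j + 1) for i in range(n) for j in range(n) if i != j and P[i][j] > 0
)
print("edges:", edges)
for row in P:
    print([str(x) for x in row])
```

The printed edge list can be checked against the rule stated in the caption: an edge from $i$ to $j$ appears exactly when $i$ and $j$ share some $A_l$ whose winner is $j$.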

Theorems & Definitions (24)

  • Example 1.1
  • Example 1.2
  • Remark 2.1
  • Example 3.1
  • Remark 3.1: One-sample one-sided confidence intervals
  • Remark 3.2: Ranking inference for the PL model with random comparison graph
  • Example 3.2: Testing top-$K$ placement
  • Example 3.3: Top-$K$ sure screening set
  • Example 3.4: Testing ranks of two samples
  • Example 3.5: Testing top-$K$ sets of two samples
  • ...and 14 more