FIVB ranking: Misstep in the right direction
Salma Tenni, Daniel Gomes de Pinho Zanco, Leszek Szczecinski
TL;DR
The paper scrutinizes the FIVB ranking algorithm that employs an explicit probabilistic model for six-level ordinal volleyball outcomes to infer team skills $\boldsymbol{\theta}$ from match results. It casts ranking as statistical inference, examines the thresholds $\boldsymbol{c}$, home-field effect $\eta$, and numerical scores $\boldsymbol{r}$, and studies parameter identification via cross-validation and numerical optimization. The authors show that while the modeling framework is a positive step, several approximations (notably the fixed $\boldsymbol{c}$, the poorly tuned $\boldsymbol{r}$, and the use of match weights) hinder optimality, and they demonstrate practical improvements by deriving better numerical scores $\tilde{\boldsymbol{r}}$, incorporating a modest home-field boost, and evaluating real-time updates with unit weights. The findings provide a principled evaluation methodology for multi-level sport rankings and offer concrete, low-cost adjustments to enhance online ranking performance, with a public repository for reproducibility.
Abstract
This work presents and evaluates the ranking algorithm that has been used by the Fédération Internationale de Volleyball (FIVB) since 2020. The prominent feature of the FIVB ranking is the use of a probabilistic model, which explicitly calculates the probabilities of future match results using the estimated strengths of the teams. Such explicit modeling is new in the context of official sport rankings, especially for multi-level outcomes, and we study the optimality of its parameters using both analytical and numerical methods. We conclude that, from the modeling perspective, the current thresholds fit the data well, but adding the home-field advantage (HFA) would be beneficial. Regarding the algorithm itself, we explain the rationale behind the approximations currently used and show a simple method to find new parameters (numerical scores) that improve the performance. We also show that the weighting of the match results is counterproductive.
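To make the ingredients named above concrete, the following is a minimal sketch of how a six-level ordinal outcome model and a score-based rating update could fit together. It assumes a cumulative-link (ordered probit) form, which the text here does not specify. The function names, the `scale` and `step` parameters, and all numerical values for the thresholds $\boldsymbol{c}$, the scores $\boldsymbol{r}$, and the home-field term $\eta$ are illustrative placeholders, not the official FIVB values or the authors' implementation.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def outcome_probs(theta_home, theta_away, thresholds, eta=0.0, scale=1.0):
    """Probabilities of the ordinal outcomes, ordered from worst to best
    result for the home team (ASSUMED ordered-probit form).

    thresholds: increasing cut points c (5 values give 6 outcomes).
    eta: home-field advantage added to the home team's skill.
    """
    z = (theta_home - theta_away + eta) / scale
    # Cumulative probabilities P(outcome <= l) = Phi(c_l - z), closed by 1.
    cumulative = [normal_cdf(c - z) for c in thresholds] + [1.0]
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


def update_skills(theta_home, theta_away, outcome, thresholds, scores,
                  step, eta=0.0, scale=1.0):
    """One online update: move skills by the gap between the observed
    numerical score and its expected value under the model."""
    probs = outcome_probs(theta_home, theta_away, thresholds, eta, scale)
    expected = sum(p * r for p, r in zip(probs, scores))
    delta = step * (scores[outcome] - expected)
    return theta_home + delta, theta_away - delta


# Hypothetical, symmetric placeholder parameters (NOT the official values).
c = [-1.0, -0.4, 0.0, 0.4, 1.0]
r = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]

probs_neutral = outcome_probs(0.0, 0.0, c)
probs_home = outcome_probs(0.0, 0.0, c, eta=0.2)
new_home, new_away = update_skills(0.0, 0.0, outcome=5, thresholds=c,
                                   scores=r, step=0.1)
```

In this sketch the update is zero-sum by construction, and a positive `eta` shifts probability mass toward home wins; the three questions raised in the abstract correspond to the choice of `c`, the choice of `r`, and whether `eta` is set to zero.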
