Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist
Niclas Boehmer, Piotr Faliszewski, Sonja Kraiczy
TL;DR
The paper studies how the Mallows distribution over rankings behaves as the number of alternatives $m$ grows, contrasting the classic dispersion parameter $\phi$ with its normalized variant $\text{norm-}\phi$. It develops a rigorous asymptotic framework for comparing properties under the two parameterizations, derives exact and asymptotic expressions for top-choice position, pairwise comparisons, and winner probabilities, and complements these theoretical results with empirical evidence from real-world data. The key finding is that the structure of the classic Mallows model often drifts with $m$, whereas the normalized variant keeps its properties stable and in line with the data; this motivates preferring the normalized parameterization in experiments that vary the number of alternatives. The paper also gives practical warnings on experiment design, parameter estimation, and generalization across different $m$, and provides publicly available code for replication.
Abstract
The Mallows model is a popular distribution for ranked data. We empirically and theoretically analyze how the properties of rankings sampled from the Mallows model change when increasing the number of alternatives. We find that real-world data behaves differently than the Mallows model, yet is in line with its recent variant proposed by Boehmer et al. [2021]. As part of our study, we issue several warnings about using the model.
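To make the contrast between the two parameterizations concrete, the following Python sketch samples rankings from the Mallows model and converts a normalized dispersion value into a classic one. It is an illustration under stated assumptions, not the authors' published code. It assumes the standard definition in which a ranking $v$ has probability proportional to $\phi^{d(v,u)}$, where $d$ is the swap (Kendall tau) distance to a central ranking $u$, and it assumes that $\text{norm-}\phi$ is defined as in Boehmer et al. [2021], that is, as the value for which the expected swap distance to the central ranking equals $\text{norm-}\phi \cdot m(m-1)/4$. The function names `expected_swap_distance`, `norm_phi_to_phi`, and `sample_mallows` are hypothetical, and the sampler uses the well-known repeated insertion scheme rather than anything specified in the text above.

```python
import random


def expected_swap_distance(m, phi):
    """Expected swap distance to the central ranking for m alternatives.

    Uses the decomposition of the distance into independent insertion
    steps: the i-th step contributes j inversions with probability
    proportional to phi**j, for j = 0, ..., i - 1.
    """
    total = 0.0
    for i in range(1, m + 1):
        weights = [phi ** j for j in range(i)]
        total += sum(j * w for j, w in enumerate(weights)) / sum(weights)
    return total


def norm_phi_to_phi(m, norm_phi, iterations=60):
    """Find phi in [0, 1] whose expected swap distance is norm_phi * m(m-1)/4.

    Assumed definition of the normalization; solved by bisection because
    the expected distance is increasing in phi.
    """
    target = norm_phi * m * (m - 1) / 4
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if expected_swap_distance(m, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_mallows(center, phi, rng):
    """Draw one ranking via repeated insertion around the central ranking."""
    ranking = []
    for i, alternative in enumerate(center):
        # i alternatives are already placed; inserting at index j puts the
        # new alternative above i - j of them, creating i - j inversions.
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, alternative)
    return ranking


if __name__ == "__main__":
    rng = random.Random(0)
    for m in (5, 20, 80):
        phi = norm_phi_to_phi(m, 0.5)
        vote = sample_mallows(list(range(m)), phi, rng)
        print(m, round(phi, 4), vote[:5])
```

Running the script prints, for several values of $m$, the classic $\phi$ that corresponds to the same normalized value $0.5$, together with the top of one sampled ranking. Under the assumed definition, the required $\phi$ moves toward $1$ as $m$ grows, which is one way to see why holding $\phi$ fixed while varying $m$ changes the character of the sampled rankings.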
