Universal emergence of local Zipf-Mandelbrot law
Davide Cugini, André Timpanaro, Giacomo Livan, Giacomo Guarnieri
TL;DR
The paper addresses the ubiquity of the Zipf-Mandelbrot (ZM) law by deriving a general analytic relation between order statistics and the parent distribution. It introduces Order Duality Relationships (ODRs), which give closed-form expressions for the means and covariances of order statistics and show a concentration of measure as $N\to\infty$. Using ODRs, it proves that a small subset of ranked observations exhibits ZM behavior with a locally defined exponent $\alpha(\lambda_0) = -1/(z^{(1)}(H_0)+1)$, where $H_0 = \ln\langle X(\lambda_0)\rangle$. The authors validate the theory on Miller's typing monkey, Barabási-Albert networks, and Gaussian data, illustrating both global and local emergence of ZM and providing a rigorous criterion for when a distribution should be read as a power law. The results offer a unifying framework for interpreting rank-frequency patterns across disciplines and indicate when local ZM behavior is expected in large datasets.
Abstract
A plethora of natural and socio-economic phenomena share a striking statistical regularity: the magnitude of elements decreases as a power law of their position in a ranking by magnitude. This regularity is known as the Zipf-Mandelbrot law (ZM), and plenty of problem-specific explanations for its emergence have been provided in different fields. Yet an explanation for the ubiquity of ZM is currently lacking. In this paper we first provide an analytical expression for the cumulants of any ranked sample of i.i.d. random variables once sorted in decreasing order. We then use this result to rigorously demonstrate that, whenever a small fraction of such a ranked dataset is considered, it becomes statistically indistinguishable from a ZM law. We finally validate our results against several relevant examples.
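The local claim can be illustrated numerically with a minimal sketch. This is not the authors' procedure. It draws i.i.d. standard Gaussian samples, sorts them in decreasing order, and fits a straight line to log value against log rank over a narrow rank window. The sample size, the rank window (1,000 to 2,000), the seed, and the helper name `local_loglog_fit` are all assumptions chosen for illustration. The fit is a pure power law, that is, a ZM law with the Mandelbrot shift set to zero, which is a simplification.

```python
import numpy as np


def local_loglog_fit(sample, r_min, r_max):
    """Least-squares fit of ln(value) = intercept + slope * ln(rank)
    over ranks r_min..r_max (1-indexed, inclusive).

    Hypothetical helper for illustration; not taken from the paper.
    """
    ranked = np.sort(sample)[::-1]        # decreasing order: rank 1 is the largest value
    ranks = np.arange(r_min, r_max + 1)
    values = ranked[r_min - 1:r_max]      # values at the selected ranks
    if np.any(values <= 0):
        raise ValueError("the rank window must contain only positive values")

    log_r, log_x = np.log(ranks), np.log(values)
    slope, intercept = np.polyfit(log_r, log_x, 1)   # degree-1 fit: [slope, intercept]

    # Coefficient of determination of the straight-line fit in log-log space.
    residuals = log_x - (intercept + slope * log_r)
    r_squared = 1.0 - residuals.var() / log_x.var()
    return slope, intercept, r_squared


# Assumed parameters: one million Gaussian draws, window of ranks 1,000..2,000.
rng = np.random.default_rng(0)
sample = rng.standard_normal(1_000_000)
slope, intercept, r_squared = local_loglog_fit(sample, 1_000, 2_000)
print(f"log-log slope: {slope:.3f}, R^2: {r_squared:.4f}")
```

An R^2 close to 1 in this window would suggest that, locally, the ranked Gaussian data are hard to tell apart from a power law even though the parent distribution is not heavy-tailed. Moving the window to other ranks should change the fitted slope, consistent with the exponent being defined locally.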
