Benford behavior resulting from stick and box fragmentation processes
Bruce Fang, Steven J. Miller
Abstract
Benford's law is the statement that in many real-world data sets, the probability of having digit $d$ in base $B$ as the first digit is $\log_{B}\!\left(\frac{d+1}{d}\right)$ for all $1 \leq d \leq B-1$. We sometimes refer to this as weak Benford behavior, and we say that a data set satisfies strong Benford behavior in base $B$ if the probability of having significand at most $s$ is $\log_{B}\!\left(s\right)$ for all $1 \leq s < B$. We examine Benford behavior in two probabilistic models: the stick and box fragmentation models. Building on the work in arXiv:1309.5603 on the single proportion stick fragmentation model, we employ combinatorial identities on multinomial coefficients to reduce the multi-proportion stick fragmentation model to the single proportion model. We then provide a necessary and sufficient condition for the lengths of the stick fragments to converge to strong Benford behavior, along with a quantification of the discrepancy from the uniform distribution on $[0,1]$ in terms of the irrationality exponent. We then answer a conjecture of arXiv:2304.08335 on the high-dimensional box fragmentation model. Using tools from Fourier analysis and order statistics, we prove that under mild conditions, the faces of any dimension of the box have total volume converging to strong Benford behavior.
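The definitions above can be explored numerically. The following Python sketch is illustrative only and is not taken from the paper. It assumes that the single proportion model cuts every stick at the same fixed proportion $p$ at each of $N$ stages, so that a fraction $\binom{N}{k}/2^{N}$ of the fragments has length $p^{k}(1-p)^{N-k}$ for a starting stick of length 1. The function names `benford_probs`, `first_digit_probs`, and `max_deviation` are hypothetical helpers introduced for this example.

```python
import math


def benford_probs(base=10):
    """Weak Benford probabilities log_B((d+1)/d) for d = 1, ..., B-1."""
    return {d: math.log((d + 1) / d, base) for d in range(1, base)}


def first_digit_probs(p, levels, base=10):
    """First-digit distribution of fragment lengths.

    Assumed model (not spelled out in the abstract): every stick is cut at
    the same proportion p at each stage, so after `levels` stages a fraction
    C(levels, k) / 2^levels of the fragments has length p^k (1-p)^(levels-k).
    """
    log_p = math.log(p, base)
    log_q = math.log(1 - p, base)
    probs = {d: 0.0 for d in range(1, base)}
    for k in range(levels + 1):
        # log of C(levels, k) / 2^levels, computed via lgamma to avoid huge integers
        log_weight = (math.lgamma(levels + 1) - math.lgamma(k + 1)
                      - math.lgamma(levels - k + 1) - levels * math.log(2))
        # The fractional part of log_B(length) determines the significand
        log_len = k * log_p + (levels - k) * log_q
        frac = log_len - math.floor(log_len)
        # Clamp to guard against rounding when frac is extremely close to 1
        digit = min(int(base ** frac), base - 1)
        probs[digit] += math.exp(log_weight)
    return probs


def max_deviation(p, levels, base=10):
    """Largest absolute gap between observed and Benford first-digit probabilities."""
    target = benford_probs(base)
    observed = first_digit_probs(p, levels, base)
    return max(abs(observed[d] - target[d]) for d in target)
```

Calling `max_deviation(0.3, 5000)` gives the gap to the weak Benford probabilities for one choice of $p$. Under the same assumed model, `first_digit_probs(1 / 11, 5000)` offers a contrast: there $(1-p)/p = 10$, so all fragment lengths differ by powers of 10 and share the same first digit in base 10.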
