Empirical Measures and Strong Laws of Large Numbers in Categorical Probability
Tobias Fritz, Tomáš Gonda, Antonio Lorenzin, Paolo Perrone, Areeb Shah Mohammed
TL;DR
This work develops a categorical framework for empirical measures and laws of large numbers by introducing empirical sampling morphisms in quasi-Markov categories. It identifies two natural axioms, permutation invariance and empirical adequacy, and uses infinite Kolmogorov products, distribution objects, and de Finetti representations to derive abstract versions of the de Finetti theorem, the Glivenko–Cantelli theorem, and the strong law of large numbers. The authors provide concrete measure-theoretic instantiations on standard Borel spaces and show that the synthetic results recover the classical theorems, including the strong law of large numbers for real-valued random variables with finite first moment. Overall, the paper offers a categorical derivation of core probabilistic limit theorems from first principles and suggests pathways for further categorical treatments of ergodic-type results.
Abstract
The Glivenko-Cantelli theorem is a uniform version of the strong law of large numbers. It states that for every IID sequence of random variables, the empirical measure converges to the underlying distribution (in the sense of uniform convergence of the CDF). In this work, we provide tools to study such limits of empirical measures in categorical probability. We propose two axioms, permutation invariance and empirical adequacy, that a morphism of type $X^\mathbb{N} \to X$ should satisfy to be interpretable as taking an infinite sequence as input and producing a sample from its empirical measure as output. Since not all sequences have a well-defined empirical measure, such ``empirical sampling morphisms'' live in quasi-Markov categories, which, unlike Markov categories, allow partial morphisms. Given an empirical sampling morphism and a few other properties, we prove representability as well as abstract versions of the de Finetti theorem, the Glivenko-Cantelli theorem and the strong law of large numbers. We provide several concrete constructions of empirical sampling morphisms as partially defined Markov kernels on standard Borel spaces. Instantiating our abstract results then recovers the standard Glivenko-Cantelli theorem and the strong law of large numbers for random variables with finite first moment. Our work thus provides a joint proof of these two theorems in conjunction with the de Finetti theorem from first principles.
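For orientation, the two classical statements that the abstract refers to can be written out as follows. This is a sketch in standard measure-theoretic notation, not the categorical formulation developed in the paper; the symbols $X_i$, $F$, and $F_n$ are assumptions introduced here for illustration only. Let $X_1, X_2, \ldots$ be an IID sequence of real-valued random variables with CDF $F$, and let $F_n$ denote the empirical CDF of the first $n$ samples:
\[
  F_n(t) \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{X_i \le t\}}, \qquad t \in \mathbb{R}.
\]
The Glivenko-Cantelli theorem then reads
\[
  \sup_{t \in \mathbb{R}} \bigl| F_n(t) - F(t) \bigr| \;\longrightarrow\; 0 \quad \text{almost surely as } n \to \infty,
\]
while the strong law of large numbers states that, whenever $\mathbb{E}|X_1| < \infty$,
\[
  \frac{1}{n} \sum_{i=1}^{n} X_i \;\longrightarrow\; \mathbb{E}[X_1] \quad \text{almost surely as } n \to \infty.
\]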
