Shannon Information and Kolmogorov Complexity
Peter Grünwald, Paul Vitányi
TL;DR
The paper juxtaposes Shannon entropy $H(X)$ with Kolmogorov complexity $K(x)$ to clarify their distinct aims: ensemble-based coding versus individual-object description. It shows that the expected Kolmogorov complexity under a computable distribution equals Shannon entropy up to an additive term bounded by the complexity of the distribution itself, and that probabilistic mutual information corresponds closely to its algorithmic analogue on average. By introducing the algorithmic structure function and algorithmic sufficient statistics, the authors connect MDL and rate-distortion theory to a purely descriptional framework, enabling model selection that balances meaningful structure against parsimony. The work also establishes that probabilistic and algorithmic notions of information obey analogous inequalities, with algorithmic quantities matching probabilistic ones up to additive (often logarithmic) terms, and it highlights practical pathways through universal coding and MDL as bridges between theory and data analysis.
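Since $K(x)$ is uncomputable, any real compressor yields only an upper bound on it. The minimal Python sketch below (not from the paper; the three-symbol distribution, sample size, and use of zlib are illustrative assumptions) compares the per-symbol compressed length of an i.i.d. sample against the source entropy $H(X)$, making the "expected description length tracks entropy" claim concrete.

```python
import math
import random
import zlib

# A minimal sketch: Shannon entropy H(X) is an ensemble average, while
# Kolmogorov complexity K(x) describes one string. K is uncomputable, so
# we use zlib's compressed length as a crude computable upper bound on
# K(x) and compare its per-symbol rate with H(X).

def shannon_entropy(p):
    """Entropy H(X) in bits of a finite distribution p = {symbol: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def compressed_bits(x: bytes) -> int:
    """Bit length of zlib's encoding of x: an upper-bound proxy for K(x)."""
    return 8 * len(zlib.compress(x, 9))

random.seed(0)
p = {b"a": 0.7, b"b": 0.2, b"c": 0.1}  # illustrative source distribution
n = 10_000  # sample length; per-symbol costs improve as n grows

symbols, weights = zip(*p.items())
sample = b"".join(random.choices(symbols, weights=weights, k=n))

print(f"H(X)               = {shannon_entropy(p):.3f} bits/symbol")
print(f"compressed len / n = {compressed_bits(sample) / n:.3f} bits/symbol")
# For i.i.d. data the compressor's per-symbol rate comes within a small
# overhead of H(X), mirroring the additive slack in the exact theorem.
```

The residual gap between the two printed rates reflects the compressor's overhead, the finite sample, and the fact that zlib is only an upper bound on $K(x)$, not $K$ itself.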
Abstract
We compare the elementary theories of Shannon information and Kolmogorov complexity, the extent to which they have a common purpose, and where they are fundamentally different. We discuss and relate the basic notions of both theories: Shannon entropy versus Kolmogorov complexity, the relation of both to universal coding, Shannon mutual information versus Kolmogorov ('algorithmic') mutual information, probabilistic sufficient statistic versus algorithmic sufficient statistic (related to lossy compression in the Shannon theory versus meaningful information in the Kolmogorov theory), and rate distortion theory versus Kolmogorov's structure function. Part of the material has appeared in print before, scattered through various publications, but this is the first comprehensive systematic comparison. The last mentioned relations are new.
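For reference, the central quantitative bridge alluded to above can be sketched in display form (a standard statement in this literature; the additive constants depend on the fixed reference universal machine). For a computable probability mass function $p$ with entropy $H(X)$,

$$
H(X) \;\le\; \sum_{x} p(x)\,K(x) \;\le\; H(X) + K(p) + O(1),
$$

so expected Kolmogorov complexity equals Shannon entropy up to an additive term bounded by $K(p)$, the complexity of the distribution itself. Analogously, the expectation of the algorithmic mutual information $I(x:y) = K(x) + K(y) - K(x,y)$ under $p$ agrees with the probabilistic mutual information $I(X;Y)$ up to additive terms of order $K(p)$.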
