On Kolmogorov Structure Functions

Samuel Epstein

TL;DR

The paper investigates how Kolmogorov's structure function $\mathbf{H}_k(x)$ interacts with the Minimum Description Length (MDL) principle under the Independence Postulate (IP). It derives bounds linking $\mathbf{I}(x;\mathcal{H})$, the information the halting sequence has about $x$, to the growth of the structure function, and introduces the minimal sufficient statistic $k^*(x)$ with the bound $k^*(x) <^\log \mathbf{K}(\mathbf{K}(x)) + \mathbf{I}(x;\mathcal{H})$. It then discusses the obstacles facing set-restricted variants, argues that unrestricted structure functions do not readily describe physical-world data, and highlights how IP renders many structure-function claims purely mathematical. The discussion advocates carefully restricted model classes and encodings for meaningful two-part decompositions, and connects these ideas to measuring strings with long-running shortest programs.

Abstract

All strings with low mutual information with the halting sequence will have flat Kolmogorov Structure Functions, in the context of Algorithmic Statistics. Assuming the Independence Postulate, strings with non-negligible information with the halting sequence are purely mathematical constructions, and cannot be found in nature. Thus Algorithmic Statistics does not study strings in the physical world. This leads to the general thesis that two-part codes require limitations, as shown in the Minimum Description Length Principle. We also discuss issues with set-restricted Kolmogorov Structure Functions.

Paper Structure (7 sections, 3 theorems, 9 equations, 1 figure)

Key Result

Theorem 1

Figures (1)

  • Figure 1: A visual representation of the Kolmogorov Structure Function $\mathbf{H}_k(x)$. The amount of information that the halting sequence has about $x$ is $h=\mathbf{I}(x;\mathcal{H})$. Since $h$ is negligible for almost all $x$, the Kolmogorov Structure Function is almost always flat.
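
For readers without the paper at hand, the symbols in this caption can be unpacked using the standard definitions from algorithmic statistics. This is a sketch under the assumption that the paper follows the usual conventions; its exact notation and precision terms may differ. Here $S$ ranges over finite sets of strings containing $x$, and $\mathbf{K}$ is prefix Kolmogorov complexity:

$$
\mathbf{H}_k(x) = \min\{\log|S| \;:\; x \in S,\ \mathbf{K}(S) \le k\},
\qquad
\mathbf{I}(x;\mathcal{H}) = \mathbf{K}(x) - \mathbf{K}(x \mid \mathcal{H}).
$$

Any such set $S$ yields a two-part description of $x$: first describe $S$, then give the index of $x$ within $S$. This gives $\mathbf{K}(x) <^+ \mathbf{K}(S) + \log|S|$, so $k + \mathbf{H}_k(x)$ cannot fall below $\mathbf{K}(x)$ by more than an additive constant.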

Theorems & Definitions (6)

  • Theorem 1: GacsTrVi01
  • proof
  • Corollary 1
  • Claim 1
  • Proposition 1
  • Claim 2