An additively optimal interpreter for approximating Kolmogorov prefix complexity

Zoe Leyva-Acosta; Eduardo Acuña Yeomans; Francisco Hernandez-Quiroz

TL;DR

This work develops IMP2, a prefix-free, additively optimal interpreter built on a high-level IMP-like language, to serve as a reference machine for approximating Kolmogorov prefix complexity via the Coding Theorem Method (CTM). By enumerating IMP2 programs by length and running them on a resource-bounded interpreter, the authors produce CTM-based complexity estimates $CTM_{(n,m)}(x)=-\log_2 D_{(n,m)}(x)$ and SPF values, enabling comparison with earlier models and direct SPF-based complexity. The study finds that global CTM rankings across models can be strongly correlated, but local per-length correlations may be weak, suggesting sensitivity to the chosen model and the need for larger program spaces to reach a more natural distribution. Importantly, CTM under IMP2 exhibits a very high correlation with SPF ($\text{Spearman} \approx 0.986$, $\text{Pearson} \approx 0.911$), supporting CTM as a valid, finer-grained approximation method for algorithmic complexity, while highlighting practical limitations due to a high non-halting fraction in the prefix-free program space.
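The estimate $CTM_{(n,m)}(x)=-\log_2 D_{(n,m)}(x)$ can be illustrated with a minimal Python sketch. It assumes, as in the usual CTM formulation, that $D_{(n,m)}(x)$ is the fraction of halting programs that output $x$; the exact definition used for IMP2 (for instance, whether programs are weighted by length) is not stated here, and the function name `ctm_estimates` and the toy data are hypothetical.

```python
import math
from collections import Counter


def ctm_estimates(outputs):
    """Map each observed output x to -log2 of its empirical frequency.

    `outputs` holds one entry per enumerated program: the output string
    if the program halted within the resource bound, or None otherwise.
    Assumption: D(x) = (# halting programs producing x) / (# halting programs).
    """
    halting = [o for o in outputs if o is not None]
    if not halting:
        return {}
    counts = Counter(halting)
    total = len(halting)
    return {x: -math.log2(c / total) for x, c in counts.items()}


# Hypothetical toy run: 8 programs, 2 of which did not halt in time.
runs = ["0", "0", "1", "0", None, "01", None, "1"]
estimates = ctm_estimates(runs)
```

In this toy run the most frequently produced string receives the lowest estimate, which is the behaviour the Coding Theorem Method relies on: strings produced by many programs are treated as simpler.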

Abstract

We study practical approximations to Kolmogorov prefix complexity (K) using IMP2, a high-level programming language. Our focus is on investigating the optimality of the interpreter for this language as the reference machine for the Coding Theorem Method (CTM), a method based on the principles of algorithmic probability that was advanced for applications of algorithmic complexity as an alternative to the popular, traditional lossless compression approach. The chosen model of computation is proven to be suitable for this task, and a comparison to other models and methods is performed. Our findings show that CTM approximations using our model do not always correlate with results from lower-level models of computation. This suggests some models may require a larger program space to converge to Levin's universal distribution. Furthermore, we compare CTM with an upper bound to Kolmogorov complexity and find a strong correlation, supporting CTM's validity as an approximation method with finer-grained resolution of K.
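The comparison between CTM values and an upper bound on complexity rests on correlation coefficients. The following is a minimal, dependency-free sketch of how Pearson and Spearman coefficients could be computed for two lists of estimates over the same strings; the function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical estimates for four strings (not the paper's data).
ctm_values = [1.0, 1.6, 2.6, 3.0]
upper_bounds = [2, 3, 5, 9]
rho = spearman(ctm_values, upper_bounds)
r = pearson(ctm_values, upper_bounds)
```

Spearman depends only on the ordering of the two lists, while Pearson also reflects how linear the relationship is, so the two coefficients can differ on the same data.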

Paper Structure (14 sections, 1 theorem, 11 equations, 6 figures, 3 tables)

Introduction
Preliminaries
Methodology and techniques
From IMP to IMP2
Syntax of IMP2
Semantics of IMP2
Input/output convention
Prefix-free optimality
Enumeration of IMP2 sentences
Enumerating IMP2 programs by length
Results
On the convergence towards a 'natural' distribution
Validation of CTM by SPF
Concluding remarks

Key Result

Proposition 1

The universal machine $\text{IMP2}$ is additively optimal for the class of prefix machines.
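
For reference, additive optimality is usually stated as follows. The notation $K_M$ for the prefix complexity relative to a machine $M$ and the constant $c_M$ are assumptions of this sketch, not symbols taken from the source: for every prefix machine $M$ there is a constant $c_M$, independent of $x$, such that

$$K_{\text{IMP2}}(x) \le K_M(x) + c_M \quad \text{for all strings } x.$$

Informally, switching from any other prefix machine to $\text{IMP2}$ lengthens the shortest description of a string by at most a fixed number of bits.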

Figures (6)

Figure 1: Context-free grammar for IMP2. $N$ stands for natural numbers without leading zeros.
Figure 2: Example of an IMP2 sentence computing $5!$ and storing the result in location $1$.
Figure 3: Example of IMP2 sentence counting consecutive $1$ bits from the input stream and storing the result in location $0$.
Figure 4: CTM approximation comparison between $\text{IMP2}_{40}$ and $(4,2)$.
Figure 5: Complexity estimations comparison for $\text{IMP2}_{40}$ for all binary strings of length $6$ and below.
...and 1 more figure

Theorems & Definitions (2)

Proposition 1
proof
