An additively optimal interpreter for approximating Kolmogorov prefix complexity
Zoe Leyva-Acosta, Eduardo Acuña Yeomans, Francisco Hernandez-Quiroz
TL;DR
This work develops IMP2, a prefix-free, additively optimal interpreter built on a high-level IMP-like language, to serve as a reference machine for approximating Kolmogorov prefix complexity via the Coding Theorem Method (CTM). By enumerating IMP2 programs by length and running them on a resource-bounded interpreter, the authors produce CTM-based complexity estimates $CTM_{(n,m)}(x)=-\log_2 D_{(n,m)}(x)$ together with SPF values, enabling comparison both with earlier models and with SPF-based complexity estimates. The study finds that global CTM rankings across models can be strongly correlated while local per-length correlations may be weak. This suggests that the estimates are sensitive to the chosen model and that larger program spaces may be needed to reach a more natural distribution. Importantly, CTM under IMP2 correlates very strongly with SPF ($\text{Spearman} \approx 0.986$, $\text{Pearson} \approx 0.911$), which supports CTM as a valid, finer-grained approximation method for algorithmic complexity. The study also highlights a practical limitation: a high fraction of programs in the prefix-free program space do not halt.
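To make the formula $CTM_{(n,m)}(x)=-\log_2 D_{(n,m)}(x)$ concrete, the following is a minimal sketch, not the authors' implementation. It assumes that $D_{(n,m)}(x)$ is the fraction of halting programs in the enumerated space whose output is $x$; the summary above does not define $D$, so this reading is an assumption. The function name `ctm_estimates` and the toy outputs are hypothetical and are not data from the paper.

```python
import math
from collections import Counter


def ctm_estimates(halting_outputs):
    """Return {output: CTM estimate} from the outputs of halting programs.

    Assumption (not stated in the source): D(x) is the relative frequency
    of x among the outputs of all programs that halted within the bounds.
    """
    if not halting_outputs:
        raise ValueError("need at least one halting program")
    counts = Counter(halting_outputs)
    total = len(halting_outputs)
    # CTM(x) = -log2 D(x), with D(x) = count(x) / total
    return {x: -math.log2(c / total) for x, c in counts.items()}


# Hypothetical toy data: one output string per halting program.
toy_outputs = ["0", "0", "0", "0", "1", "1", "01", "0101"]
estimates = ctm_estimates(toy_outputs)

# Frequent outputs receive low estimates, rare outputs high ones.
for output, value in sorted(estimates.items(), key=lambda kv: kv[1]):
    print(output, value)
```

Under this assumption, programs that do not halt within the resource bounds simply do not contribute to the counts, which is why a high non-halting fraction shrinks the usable sample.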
Abstract
We study practical approximations to Kolmogorov prefix complexity (K) using IMP2, a high-level programming language. Our focus is on the optimality of the interpreter for this language, which serves as the reference machine for the Coding Theorem Method (CTM), a method based on the principles of algorithmic probability that was advanced as an alternative to the popular lossless-compression approach to approximating algorithmic complexity. The chosen model of computation is proven to be suitable for this task, and we compare it with other models and methods. Our findings show that CTM approximations using our model do not always correlate with results from lower-level models of computation. This suggests that some models may require a larger program space to converge to Levin's universal distribution. Furthermore, we compare CTM with an upper bound on Kolmogorov complexity and find a strong correlation, supporting CTM's validity as an approximation method with finer-grained resolution of K.
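The comparison between CTM values and an upper bound on K rests on rank and linear correlation. The sketch below shows, in plain Python, how Spearman and Pearson coefficients can be computed for two paired lists of scores. It is an illustration only: the helper names and the toy numbers are hypothetical, and it does not reproduce the authors' data or pipeline.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy scores for four strings (not results from the paper).
toy_ctm = [1.0, 2.0, 3.0, 5.0]
toy_upper_bound = [2, 4, 8, 32]

rho = spearman(toy_ctm, toy_upper_bound)
r = pearson(toy_ctm, toy_upper_bound)
print(rho, r)
```

In this toy case the two score lists order the strings identically, so the rank correlation is perfect while the linear correlation is lower. This is one way a Spearman coefficient can exceed the Pearson coefficient for the same data.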
