Multiplicative Turing Ensembles, Pareto's Law, and Creativity

Alexander Kolpakov, Aidan Rocke

TL;DR

The paper introduces the Multiplicative Turing Ensemble (MTE), a prime-multiplier Markov process grounded in probabilistic Turing machines and encoded via Elias' $\omega$ codelength. By applying a maximum-entropy principle with energy $E(n)=\ell_\omega(n)$, it derives a canonical Gibbs prior on primes; a scaled version with $\beta>1$ yields finite moments and Pareto tails for additive gaps. It proves almost-sure convergence of time-averaged codelengths along MTE trajectories even though the chain is transient, and demonstrates that real-world code-size distributions (Debian and PyPI) are better captured by a scaled $\omega$ prior than by the pure $\omega$ prior or a uniform baseline. The work connects algorithmic information theory, tail behavior, and empirical distributions of complexity, offering a framework to distinguish machine-driven from human-driven complexity in practical datasets. Overall, the approach enables principled modeling of multiplicative integer dynamics with implications for complexity, coding, and the study of Pareto-like phenomena in computational contexts.
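To make the energy $E(n)=\ell_\omega(n)$ concrete, here is a minimal sketch that computes the integer codelength of the standard Elias $\omega$ code and uses it to build a Gibbs-style prior on primes. Several choices here are assumptions rather than the paper's definitions: the weights are taken proportional to $2^{-\beta\,\ell_\omega(p)}$ (base 2), the support is truncated to primes below a finite `limit` so the normalizer can be summed directly, and the function names are hypothetical.

```python
def elias_omega_length(n: int) -> int:
    """Bit length of the standard Elias omega code of n >= 1."""
    if n < 1:
        raise ValueError("Elias omega is defined for positive integers")
    length = 1  # the terminating '0' bit
    while n > 1:
        bits = n.bit_length()  # binary representation of n is prepended
        length += bits
        n = bits - 1           # next group encodes (number of bits - 1)
    return length


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes; assumes limit >= 2."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def gibbs_prior_on_primes(limit: int, beta: float = 1.0) -> dict[int, float]:
    """Truncated Gibbs prior: P(p) proportional to 2**(-beta * l_omega(p)).

    Assumption: base-2 weights and a finite cutoff `limit`; the paper's
    prior lives on all primes and may be normalized differently.
    """
    primes = primes_up_to(limit)
    weights = [2.0 ** (-beta * elias_omega_length(p)) for p in primes]
    z = sum(weights)  # partition function over the truncated support
    return {p: w / z for p, w in zip(primes, weights)}


# Codelengths of 1, 2, 16, 100 are 1, 3, 11, 13 bits respectively.
for n in (1, 2, 16, 100):
    print(n, elias_omega_length(n))

prior = gibbs_prior_on_primes(100, beta=1.5)
print(sum(prior.values()))  # close to 1.0 up to floating-point error
```

Raising `beta` above 1 concentrates more mass on small primes, which is the direction in which the scaled prior described above obtains finite moments; the truncation means this sketch cannot itself exhibit the tail behavior.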

Abstract

We study integer-valued multiplicative dynamics driven by i.i.d. prime multipliers and connect their macroscopic statistics to universal codelengths. We introduce the Multiplicative Turing Ensemble (MTE) and show how it arises naturally - though not uniquely - from ensembles of probabilistic Turing machines. Our modeling principle is variational: taking Elias' Omega codelength as an energy and imposing maximum entropy constraints yields a canonical Gibbs prior on integers and, by restriction, on primes. Under mild tail assumptions, this prior induces exponential tails for log-multipliers (up to slowly varying corrections), which in turn generate Pareto tails for additive gaps. We also prove time-average laws for the Omega codelength along MTE trajectories. Empirically, on Debian and PyPI package size datasets, a scaled Omega prior achieves the lowest KL divergence against codelength histograms. Taken together, the theory-data comparison suggests a qualitative split: machine-adapted regimes (Gibbs-aligned, finite first moment) exhibit clean averaging behavior, whereas human-generated complexity appears to sit beyond this regime, with tails heavy enough to produce an unbounded first moment, and therefore no averaging of the same kind.
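The multiplicative dynamics described in the abstract can be illustrated with a small simulation: starting from $X_0 = 1$, each step multiplies the state by a prime drawn i.i.d. from a fixed distribution, and the Elias $\omega$ codelength of the state is recorded along the trajectory. This is a sketch under stated assumptions, not the paper's construction: the prime list is a short hypothetical set, the sampling weights $2^{-\ell_\omega(p)}$ are an assumed stand-in for the Gibbs prior, and the normalized quantity $\ell_\omega(X_t)/t$ is only one plausible reading of a "time average", since this summary does not state which statistic the theorems concern.

```python
import random


def elias_omega_length(n: int) -> int:
    """Bit length of the standard Elias omega code of n >= 1."""
    if n < 1:
        raise ValueError("Elias omega is defined for positive integers")
    length = 1  # terminating '0' bit
    while n > 1:
        bits = n.bit_length()
        length += bits
        n = bits - 1
    return length


def simulate_mte(primes, weights, steps, seed=0):
    """Simulate X_{t+1} = X_t * P_t with P_t drawn i.i.d. from `primes`.

    Returns the final state and the codelength of X_t after each step.
    `weights` need not be normalized (random.choices accepts raw weights).
    """
    rng = random.Random(seed)
    x = 1
    lengths = []
    for _ in range(steps):
        p = rng.choices(primes, weights=weights, k=1)[0]
        x *= p  # Python integers are arbitrary precision, so no overflow
        lengths.append(elias_omega_length(x))
    return x, lengths


# Hypothetical small support; the paper's prior is on all primes.
primes = [2, 3, 5, 7, 11, 13]
weights = [2.0 ** -elias_omega_length(p) for p in primes]

x_final, lengths = simulate_mte(primes, weights, steps=200)

# Codelength per step, one possible normalization for inspecting averaging.
per_step = [length / (t + 1) for t, length in enumerate(lengths)]
print(per_step[9], per_step[99], per_step[199])
```

Because the state only grows, the chain never returns to earlier values, which is the transience mentioned in the TL;DR; the per-step ratio is what one would watch to see whether it settles as the number of steps increases.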

Paper Structure

This paper contains 24 sections, 9 theorems, 77 equations, 2 figures.

Key Result

Lemma 2.1

If $p_0,p_1,p_S>0$, then [...]; hence $\mu_\Pi$ in Eq. (mu-Pi) is a well-defined probability distribution on the primes.

Figures (2)

  • Figure 1: PyPI distribution package sizes: the unscaled panel compares the empirical distribution with the "pure" Gibbs prior and uniform baseline; the scaled panel shows the effect of applying a linear fit to the Gibbs prior.
  • Figure 2: Debian distribution package sizes: the unscaled panel contrasts the empirical distribution with the "pure" Gibbs prior and uniform baseline; the scaled panel shows the effect of applying a linear fit to the Gibbs prior.

Theorems & Definitions (23)

  • Lemma 2.1: $\mu_\Pi$ well-defined and positive
  • proof
  • Proposition 2.2: Equivalence of ensemble viewpoints
  • proof
  • Definition 2.3: Multiplicative Turing Ensemble
  • Definition 2.4: Elias $\omega$ codelength
  • Remark 2.5
  • Proposition 2.6: Possible energy functions
  • proof
  • Remark 2.7
  • ...and 13 more