Fast Simulation of Cellular Automata by Self-Composition

Joseph Natal; Oleksiy Al-saadi

TL;DR

The paper speeds up the simulation of one-dimensional, two-color cellular automata by composing the local rule with itself $k$ times, which gives a composite rule of radius $kr$. It proves that this $k$-fold composition yields a time-space tradeoff: the time to compute generation $n$ drops from $O(n^2)$ toward $O(n^2 / \log n)$, while memory usage rises to $O(n^2 / (\log n)^3)$, with $k$ selected via a Lambert $W$-based optimization. The key contributions are a formal construction showing $X_{2n-1}^{\mathcal{H}} = X_n^{\mathcal{G}}$, the $k$-fold composition framework, and complexity analyses backed by experimental validation on Rule 30. The findings indicate a principled way to accelerate CA simulations under RAM-based models, though practical gains are bounded by memory bandwidth and hardware constraints. The work also suggests memory-layout improvements (e.g., De Bruijn sequences) as a possible source of practical gains.

Abstract

Computing the configuration of any one-dimensional cellular automaton at generation $n$ can be accelerated by constructing and running a composite rule with a radius proportional to $\log n$. The new automaton is the original one, but with its local rule function composed with itself. Consequently, the asymptotic time complexity of computing the configuration at generation $n$ is reduced from $O(n^2)$ to $O(n^2 / \log n)$, at the cost of $O(n^2/(\log n)^3)$ space, demonstrating a time-memory tradeoff. Experimental results are given in the case of Rule 30.
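The self-composition described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's experiment code: the function names (`rule_table`, `compose`, `step`), the tuple-based table layout, and the zero-padded finite row are all assumptions made for illustration, and the padding is only valid when the all-zero neighborhood maps to zero, as it does for Rule 30.

```python
from itertools import product

def rule_table(rule_number):
    """Local rule of an elementary CA (radius 1, two states) as a lookup table.

    Keys are 3-cell neighborhoods (left, center, right); values are the new cell.
    """
    return {bits: (rule_number >> (bits[0] * 4 + bits[1] * 2 + bits[2])) & 1
            for bits in product((0, 1), repeat=3)}

def compose(table, radius, k):
    """Build the k-fold composite table (radius k * radius).

    Each neighborhood of 2*k*radius + 1 cells is reduced to a single cell
    by applying the base rule k times.
    """
    width = 2 * radius + 1
    composite = {}
    for nbhd in product((0, 1), repeat=2 * k * radius + 1):
        cells = nbhd
        for _ in range(k):
            cells = tuple(table[cells[i:i + width]]
                          for i in range(len(cells) - width + 1))
        composite[nbhd] = cells[0]
    return composite

def step(cells, table, radius):
    """One synchronous update of a finite row on an all-zero background.

    The returned row is `radius` cells longer on each side.
    Assumes the all-zero neighborhood maps to 0.
    """
    width = 2 * radius + 1
    padded = (0,) * (2 * radius) + tuple(cells) + (0,) * (2 * radius)
    return tuple(table[padded[i:i + width]]
                 for i in range(len(padded) - width + 1))

if __name__ == "__main__":
    h = rule_table(30)
    g = compose(h, radius=1, k=3)      # 3-fold composite of Rule 30, radius 3

    row = (1,)                         # single-cell seed
    for _ in range(3):
        row = step(row, h, 1)          # three base steps

    # One composite step should match three base steps.
    print(step((1,), g, 3) == row)
```

The composite table has $2^{2kr+1}$ entries, so each additional fold multiplies its size by $2^{2r}$ while saving a proportional share of update steps. This is the source of the time-memory tradeoff.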

Paper Structure (7 sections, 6 theorems, 13 equations, 7 figures, 1 table)

Introduction
Preliminaries
Automata Self-Composition
Experimental Results
Discussion
Appendix
Experiment code

Key Result

Lemma 3

Given a CA $\mathcal{H}$ with local rule $h$ and radius $r$, there exists a CA $\mathcal{G}$ with local rule $g$ and radius $2r$ such that for every $n \in \mathbb{N}$ we have that $X_{2n-1}^{\mathcal{H}} = X_n^{\mathcal{G}}$.
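A natural candidate for $g$, sketched here rather than quoted from the paper's proof, applies $h$ to each of the $2r+1$ overlapping windows of a $(4r+1)$-cell neighborhood and then applies $h$ once more to the results. The cell labels $x_{-2r}, \dots, x_{2r}$ are notation introduced for this illustration.

$$
g(x_{-2r}, \dots, x_{2r}) = h\bigl(h(x_{-2r}, \dots, x_{0}),\ h(x_{-2r+1}, \dots, x_{1}),\ \dots,\ h(x_{0}, \dots, x_{2r})\bigr)
$$

If configurations are indexed so that $X_1$ is the initial configuration (an assumption consistent with the statement of the lemma), then reaching $X_n^{\mathcal{G}}$ takes $n-1$ steps of $\mathcal{G}$, which correspond to $2(n-1)$ steps of $\mathcal{H}$ and therefore to generation $1 + 2(n-1) = 2n-1$ of $\mathcal{H}$.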

Figures (7)

Figure 1: Evolution of Rule 30 beginning with a simple seed and its 2-fold and 3-fold composition with itself (see Definition \ref{def:folding}). Highlighted rows show a sample of equivalent configurations.
Figure 2: Constructing a larger radius automaton improves simulation speed on a given machine (Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz). The bottom panel shows the optimal radius as a function of time. It also shows the times at which a simulation at a given radius surpasses those of a smaller radius given on the vertical axis. Given a fixed amount of simulation time $t$ seconds, the optimal radius is approximated by $\lfloor 0.37 \log_2 (t) + 9.4 \rfloor$.
Figure 3: The machine in this experiment obeys an $n^2 / k$ time complexity scaling law ($k=r$) to compute the next generation, validating the use of Eq. \ref{eq:bigequality}. The ideal curve uses $r=1$ as a reference, so if $f(r)$ is the measured number of seconds per squared generation, the ideal is $f_{\text{ideal}}(r)=f(r=1)/r$.
Figure 4: The 27-fold composition (4.5 petabytes) will overtake the bitwise-optimized implementation in about 60 years on an Intel(R) Xeon(R) Gold CPU. This takes the principle of delayed gratification to its extreme. This algorithm might only be practical with special hardware. The curves were generated by extrapolating the quadratic time versus generation curves and the exponential dependence on $r$ for composite rule creation.
Figure 5: Rule 30's state transition (colored De Bruijn) diagram. The left cell in the edge rule $\{ \square,\blacksquare \} \rightarrow \{ \square,\blacksquare \}$ is read in from the cell configuration, and the right cell is written to the configuration. Red edges visually indicate this output cell is $\blacksquare$ and green edges indicate the output is $\square$. As an edge is traversed, the neighborhood is shifted left by one cell.
...and 2 more figures
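
The numbers quoted in the captions of Figures 2 and 4 can be reproduced with a short Python sketch. The function names are hypothetical, and storing one output bit per table entry is an assumption that the captions do not state. The radius fit applies only to the machine named in Figure 2.

```python
import math

def fitted_optimal_radius(t_seconds):
    """Empirical fit from the Figure 2 caption: floor(0.37 * log2(t) + 9.4)."""
    return math.floor(0.37 * math.log2(t_seconds) + 9.4)

def table_bytes(radius, bits_per_entry=1):
    """Size of a two-state lookup table with 2^(2*radius + 1) entries.

    bits_per_entry=1 is an assumption, not something the captions specify.
    """
    return (2 ** (2 * radius + 1)) * bits_per_entry // 8

if __name__ == "__main__":
    for t in (1, 1024, 3600):
        print(t, fitted_optimal_radius(t))
    print(table_bytes(27))
```

Under these assumptions, `table_bytes(27)` evaluates to $2^{52}$ bytes, roughly $4.5 \times 10^{15}$, which is in line with the 4.5 petabytes quoted for the 27-fold composition of Rule 30 in Figure 4.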

Theorems & Definitions (15)

Definition 1
Definition 2
Lemma 3
proof
Definition 4
Lemma 5
proof
Lemma 6
proof
Lemma 7
...and 5 more
