Fast Simulation of Cellular Automata by Self-Composition

Joseph Natal, Oleksiy Al-saadi

TL;DR

The paper speeds up the simulation of one-dimensional, two-color cellular automata by composing the local rule with itself $k$ times, which yields a composite rule of radius $kr$, where $r$ is the original radius. It proves that the $k$-fold composition gives a time–space tradeoff: the time to compute generation $n$ drops from $O(n^2)$ to $O(n^2 / \log n)$, at the cost of $O(n^2 / (\log n)^3)$ memory, with $k$ chosen via a Lambert $W$-based optimization. Key contributions are a formal construction showing $X_{2n-1}^{\mathcal{H}} = X_n^{\mathcal{G}}$, the $k$-fold composition framework, and complexity analyses with experimental validation on Rule 30. The results offer a principled way to accelerate CA simulation under RAM-based models, though practical gains are bounded by memory bandwidth and hardware constraints; the work also suggests memory-layout improvements (e.g., De Bruijn sequences) that may help in practice.

Abstract

Computing the configuration of any one-dimensional cellular automaton at generation $n$ can be accelerated by constructing and running a composite rule with a radius proportional to $\log n$. The new automaton is the original one, but with its local rule function composed with itself. Consequently, the asymptotic time complexity to compute the configuration of generation $n$ is reduced from $O(n^2)$-time to $O(n^2 / \log n)$, but with $O(n^2/(\log n)^3)$-space, demonstrating a time-memory tradeoff. Experimental results are given in the case of Rule 30.
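
The self-composition idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`rule_table`, `compose`, `step`) are hypothetical, the composite rule is stored as a plain dictionary, and the row is assumed to sit on an all-zero background that the rule leaves unchanged (true for Rule 30, since $000 \mapsto 0$).

```python
from itertools import product

def rule_table(rule_number):
    """Elementary (radius-1) local rule as a dict from 3-cell tuples to a bit."""
    return {bits: (rule_number >> (4 * bits[0] + 2 * bits[1] + bits[2])) & 1
            for bits in product((0, 1), repeat=3)}

def compose(h, r, k):
    """k-fold self-composition of h: a table over windows of width 2*k*r + 1."""
    width = 2 * k * r + 1
    table = {}
    for window in product((0, 1), repeat=width):
        cells = window
        for _ in range(k):  # each application of h shrinks the window by 2*r cells
            cells = tuple(h[cells[i:i + 2 * r + 1]]
                          for i in range(len(cells) - 2 * r))
        table[window] = cells[0]  # exactly one cell is left after k applications
    return table

def step(cells, table, radius):
    """One update of a finite row on a zero background; the row grows by 2*radius."""
    padded = (0,) * (2 * radius) + tuple(cells) + (0,) * (2 * radius)
    return tuple(table[padded[i:i + 2 * radius + 1]]
                 for i in range(len(padded) - 2 * radius))

h = rule_table(30)
g = compose(h, r=1, k=3)      # radius-3 composite rule with 2**7 entries
row_h = row_g = (1,)          # simple seed: a single black cell
for _ in range(4):
    for _ in range(3):        # three steps of the original rule ...
        row_h = step(row_h, h, 1)
    row_g = step(row_g, g, 3) # ... should match one step of the composite
    assert row_h == row_g
```

The table has $2^{2kr+1}$ entries, so it grows exponentially in $k$ while the number of steps needed to reach a given generation shrinks by a factor of $k$; this is the tradeoff the abstract refers to.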

Paper Structure

This paper contains 7 sections, 6 theorems, 13 equations, 7 figures, and 1 table.

Key Result

Lemma 3

Given a CA $\mathcal{H}$ with local rule $h$ and radius $r$, there exists a CA $\mathcal{G}$ with local rule $g$ and radius $2r$ such that for every $n \in \mathbb{N}$ we have that $X_{2n-1}^{\mathcal{H}} = X_n^{\mathcal{G}}$.
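
As a worked illustration for the case $r = 1$, one natural choice of $g$ is $h$ composed with itself, as sketched below. The index bookkeeping assumes that generations are numbered from $1$ and that both automata start from the same initial configuration $X_1$, which is what the index $2n-1$ suggests; the paper's own proof may be organized differently.

```latex
% Radius-2 rule g built from a radius-1 rule h:
g(x_{-2}, x_{-1}, x_0, x_1, x_2)
  = h\bigl(h(x_{-2}, x_{-1}, x_0),\; h(x_{-1}, x_0, x_1),\; h(x_0, x_1, x_2)\bigr)

% One step of G equals two steps of H, so starting from
% X_1^{\mathcal{G}} = X_1^{\mathcal{H}}, after n - 1 steps of G:
X_n^{\mathcal{G}} = X_{1 + 2(n-1)}^{\mathcal{H}} = X_{2n-1}^{\mathcal{H}}
```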

Figures (7)

  • Figure 1: Evolution of Rule $30$ beginning with a simple seed and its $2$-fold and $3$-fold composition with itself (see Definition \ref{def:folding}). Highlighted rows show a sample of equivalent configurations.
  • Figure 2: Constructing a larger radius automaton improves simulation speed on a given machine (Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz). The bottom panel shows the optimal radius as a function of time. It also shows the times at which a simulation at a given radius surpasses those of a smaller radius given on the vertical axis. Given a fixed amount of simulation time $t$ seconds, the optimal radius is approximated by $\lfloor 0.37 \log_2 (t) + 9.4 \rfloor$.
  • Figure 3: The machine in this experiment obeys an $n^2 / k$ time complexity scaling law ($k=r$) to compute the next generation, validating the use of Eq. \ref{eq:bigequality}. The ideal curve uses $r=1$ as a reference, so if $f(r)$ is the measured number of seconds per squared generation, the ideal is $f_{\text{ideal}}(r)=f(r=1)/r$.
  • Figure 4: The $27$-fold composition (4.5 petabytes) will overtake the bitwise-optimized implementation in roughly $60$ years on an Intel(R) Xeon(R) Gold CPU. This takes the principle of delayed gratification to its extreme. This algorithm might only be practical with special hardware. The curves were generated by extrapolating the quadratic time versus generation curves and the exponential dependence on $r$ for composite rule creation.
  • Figure 5: Rule 30's state transition (colored De Bruijn) diagram. The left cell in the edge rule $\{ \square,\blacksquare \} \rightarrow \{ \square,\blacksquare \}$ is read in from the cell configuration, and the right cell is written to the configuration. Red edges visually indicate this output cell is $\blacksquare$ and green edges indicate the output is $\square$. As an edge is traversed, the neighborhood is shifted left by one cell.
  • ...and 2 more figures
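
Two of the numbers quoted in the captions above can be reproduced with a short Python sketch. The function names are hypothetical. The table-size calculation assumes a radius-1, two-color base rule stored at one bit per neighborhood, which is one reading that matches the 4.5 petabyte figure; the radius formula is the machine-specific fit quoted in the Figure 2 caption, not a general law.

```python
import math

def composite_table_bytes(k, r=1):
    """Size of a k-fold composite lookup table, assuming one output bit per
    neighborhood of 2*k*r + 1 two-color cells (an assumption, see above)."""
    neighborhood = 2 * k * r + 1
    return 2 ** neighborhood // 8

def optimal_radius(t_seconds):
    """Empirical fit from the Figure 2 caption for the tested Xeon machine."""
    return math.floor(0.37 * math.log2(t_seconds) + 9.4)

size = composite_table_bytes(27)   # 2**55 bits = 2**52 bytes
print(f"{size / 1e15:.1f} PB")     # about 4.5 decimal petabytes
print(optimal_radius(3600))        # fitted radius for a one-hour budget
```

Under these assumptions each additional fold multiplies the table size by four, which is why the crossover time in Figure 4 is so long.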

Theorems & Definitions (15)

  • Definition 1
  • Definition 2
  • Lemma 3
  • proof
  • Definition 4
  • Lemma 5
  • proof
  • Lemma 6
  • proof
  • Lemma 7
  • ...and 5 more