The Role of Sequence Information in Minimal Models of Molecular Assembly

Jeremy Guntoro, Thomas Ouldridge

TL;DR

The paper investigates how sequence information and geometric constraints govern deterministic self-assembly in two aTAM-based models: backboned aTAM, which requires each new tile to neighbour the immediately preceding tile, and sequenced aTAM, which uses a fixed tile sequence without this adjacency constraint. It proves a finite universal assembly kit exists for the backboned model, underscoring the role of geometry in enabling efficient information use, while showing no universal kit exists for the sequenced model in the stringent case, revealing fundamental limitations of sequence-only strategies. By analyzing shape spaces and Kolmogorov complexity, the study links combinatorial growth (self-avoiding walks vs polyominoes) to assembly efficiency and demonstrates that backbone constraints significantly reduce the necessary tile diversity for large targets. Overall, the work suggests that physical geometric constraints are crucial for translating sequence programs into reliable, scalable molecular assembly, with implications for designing artificial folding systems.

Abstract

Sequence-directed assembly processes - such as protein folding - allow the assembly of a large number of structures with high accuracy from only a small handful of fundamental building blocks. We aim to explore how efficiently sequence information can be used to direct assembly by studying variants of the temperature-1 abstract tile assembly model (aTAM). We ask whether, for each variant, there exists a finite set of tile types that can deterministically assemble any shape producible by a given assembly model; we call such tile type sets "universal assembly kits". Our first model, which we call the "backboned aTAM", generates backbone-assisted assembly by forcing tiles to be added to lattice positions neighbouring the immediately preceding tile, using a predetermined sequence of tile types. We demonstrate the existence of a universal assembly kit for the backboned aTAM, and show that the existence of this set is maintained even under stringent restrictions on the rules of assembly. We compare these results to a less constrained model that we call the sequenced aTAM, which also uses a predetermined sequence of tiles, but does not constrain a tile to neighbour the immediately preceding tile. We prove that this model has no universal assembly kit in the stringent case. The lack of such a kit is surprising, given that the number of tile sequences of length N scales faster than both the number and worst-case Kolmogorov complexity of producible shapes of size N for a sufficiently large - but finite - set of tiles. Our results demonstrate the importance of physical mechanisms, and specifically geometric constraints, in facilitating efficient use of the information in molecular programs for structure assembly.
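
The geometric difference between the two placement rules described in the abstract can be illustrated with a minimal sketch. It is not the authors' formalism: it assumes a 2D square lattice, tracks only tile positions, and omits the temperature-1 glue-strength check entirely. The names `backboned_candidates` and `sequenced_candidates` are hypothetical helpers introduced here for illustration.

```python
from typing import List, Set, Tuple

Pos = Tuple[int, int]

# The four nearest-neighbour offsets on a 2D square lattice.
NEIGHBOUR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def empty_neighbours(pos: Pos, occupied: Set[Pos]) -> Set[Pos]:
    """Return the unoccupied lattice sites adjacent to `pos`."""
    x, y = pos
    return {(x + dx, y + dy) for dx, dy in NEIGHBOUR_OFFSETS} - occupied


def backboned_candidates(placed: List[Pos]) -> Set[Pos]:
    """Backboned rule (geometry only): the next tile must neighbour
    the tile added in the immediately preceding step."""
    return empty_neighbours(placed[-1], set(placed))


def sequenced_candidates(placed: List[Pos]) -> Set[Pos]:
    """Sequenced rule (geometry only): the next tile may neighbour
    any tile already in the assembly."""
    occupied = set(placed)
    sites: Set[Pos] = set()
    for p in placed:
        sites |= empty_neighbours(p, occupied)
    return sites


# Positions listed in the order the tiles were added (an assumed example).
placed = [(0, 0), (1, 0), (1, 1)]
backboned = backboned_candidates(placed)
sequenced = sequenced_candidates(placed)
print(len(backboned), len(sequenced))
```

In this sketch the backboned candidate set is always a subset of the sequenced one, which reflects the sense in which the sequenced aTAM is the less constrained model. A fuller model would additionally filter both sets by the glue interactions of the next tile type in the sequence.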

Paper Structure

This paper contains 12 sections, 16 theorems, 4 equations, 20 figures, 2 algorithms.

Key Result

Theorem 1

A universal assembly kit for the backboned aTAM exists.

Figures (20)

  • Figure 1: A diagram illustrating aTAM assembly. The fundamental building blocks of assembly are square tiles with numbered faces. At each step, a tile drawn from the assigned set of tile types (bottom) is added to a random position neighbouring an existing tile, with the possible locations being restricted to those where the resulting sum of strengths of interactions formed is greater than or equal to 1 (for temperature-1 aTAM). Here, the glue pair $(1,2)$ is predetermined to have a strength of 1. Note that we use a variant of the aTAM in which tiles can be rotated. Example assembly steps starting from the state at the top left of the figure are given in the top centre and top right.
  • Figure 2: A figure illustrating assembly via the backboned aTAM and the sequenced aTAM. Consider an instance $(A_{empty},Q,g)$ of either the backboned aTAM or the sequenced aTAM, where $g(x,\underline{x}) = 1$. An example sequence $Q$ is provided in the bottom left, with the letters of each tile type corresponding to tiles found in the bottom right. For a backboned aTAM instance, added tiles must neighbour the tile added in the immediately preceding step, and hence only the top left configuration can result from the backboned aTAM. By contrast, the sequenced aTAM has no such restriction, and both the top left and top right configurations can be the final configuration in a trajectory of a sequenced aTAM instance.
  • Figure 3: A figure illustrating a finite set of "directed" tiles that comprise a universal assembly kit of the backboned aTAM (top), as well as an example configuration utilizing these tiles (bottom).
  • Figure 4: A figure illustrating the difficulties associated with assembly with only attractive (i.e. without neutral) inter-tile interactions. While in principle one can replace the neutral interface type $0$ with a self-attractive interface type $3$, such attractive interfaces can pull tiles towards unintended positions (right).
  • Figure 5: Example interacting directed tiles, with red/brown glue types representing $\sigma$, $\sigma'$ or $\sigma''$ and black glue types representing backbone faces.
  • ...and 15 more figures

Theorems & Definitions (36)

  • Theorem 1
  • Theorem 2
  • Theorem 4
  • Definition 1
  • Definition 2
  • Definition 3
  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • ...and 26 more