Resolving Speed and Encoding Bottlenecks in Fast Heteromeric Self-Assembly

Félix Benoist, Pablo Sartori

TL;DR

This work presents a kinetic-encoding framework for fast, accurate self-assembly of large heteromeric structures by combining assembly factors with geometry-driven connectivity. It identifies speed and encoding bottlenecks and shows that small, targeted increases in local connectivity can dramatically suppress them, enabling rapid, faithful retrieval of a seeded target even with multiple encoded structures. The results yield explicit scaling relations for retrieval time, error rates, and encoding capacity, including combinatorial encoding $S_{\rm max} \sim N^{1-1/n_{\rm c}}$ (or $\sim N^{1-4/z}$ in certain regimes), and demonstrate pathway funneling as a mechanism to constrain growth to preferred routes. These insights illuminate how assembly factors and kinetic discrimination could underpin biological assembly of complexes like ribosomes and spliceosomes, and guide design principles for synthetic, programmable self-assembling systems.
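
To make the quoted capacity scalings concrete, the following minimal sketch evaluates the two exponents and the resulting relative growth of the capacity when the number of component species increases. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the prefactor of the scaling law is not given in this summary (so only ratios of capacities are computed), and which exponent applies in which regime is left to the paper.

```python
# Illustrative sketch only: function names are hypothetical, and the
# unknown prefactor of the scaling law is avoided by computing ratios.

def capacity_exponent_from_nc(n_c: float) -> float:
    """Exponent in S_max ~ N^(1 - 1/n_c), as quoted in the summary."""
    return 1.0 - 1.0 / n_c


def capacity_exponent_from_z(z: float) -> float:
    """Exponent in S_max ~ N^(1 - 4/z), the alternative form quoted."""
    return 1.0 - 4.0 / z


def relative_capacity(n_small: float, n_large: float, exponent: float) -> float:
    """Ratio S_max(n_large) / S_max(n_small) for a pure power law."""
    return (n_large / n_small) ** exponent


if __name__ == "__main__":
    for n_c in (1, 2, 3, 4):
        e = capacity_exponent_from_nc(n_c)
        gain = relative_capacity(100, 400, e)
        print(f"n_c={n_c}: exponent={e:.3f}, capacity gain for N 100->400: {gain:.2f}x")
    for z in (4, 6, 8):
        e = capacity_exponent_from_z(z)
        print(f"z={z}: exponent={e:.3f}")
```

Under these formulas, an exponent of zero means the capacity does not grow with $N$, while exponents approaching one correspond to capacity growing almost linearly with the number of component species.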

Abstract

The cytoplasm is a heterogeneous mixture containing many types of proteins that self-assemble into a wide variety of complexes. The accuracy and speed of cytoplasmic self-assembly are astonishing because it involves the correct identification of components shared among different structures, despite pervasive thermal fluctuations. Typical toy models of self-assembly are based on the specificity of binding energies among components and neglect kinetic effects. However, kinetics plays a key role in biological self-assembly, often catalyzed by a plethora of assembly factors. Building on this observation, we extend a previous heteropolymer growth model to describe the retrieval of two-dimensional structures. We find that the self-assembly of structures in this model is subject to strong speed and encoding bottlenecks. Moreover, we show that these bottlenecks can be suppressed by increasing the connectivity of a small fraction of components. This mechanism of kinetically controlling a small number of critical binding events provides a simple explanation for the timely assembly of large protein complexes and suggests a unifying principle for the role of assembly factors.

Paper Structure

This paper contains 17 sections, 27 equations, 14 figures, 2 tables.

Figures (14)

  • Figure 1: Formation of large heteromeric complexes relies on many assembly factors. A. Schematic representation of the self-assembly of a protein complex. The self-assembly of protein subunits is driven by assembly factors, whose ultimate role is to favor the speed and accuracy of complex formation. The complete protein assembly contains no assembly factors. B. Structural snapshots of assembly intermediates of the yeast ribosome large subunit, showing assembly factors transiently bound to the RNA-protein assembly. Protein Data Bank IDs: 6EM3, 6LSR [Berman00]. C. Intermediates of the human proteasome (8QZ9) and of the bacterial photosystem II (7NPQ). D. In purely asymmetric complexes, the number of assembly factors increases roughly with the number of protein subunits. By contrast, complexes with a local rotational symmetry require fewer assembly factors. Gray lines: slopes 2 and $\frac{1}{2}$. Data detailed in SI Sec. \ref{sec.SI-Data} and Refs. [Wahl09, Dorner23, Woolford13, Lavdovskaia24, Vercellino22, Heinz16, Yang15, Wild12, Ruhle15, Altegoer15, Armitage20, Rousseau18] therein. E. The connectivity of components in an assembly is highly variable. Highly connected components correspond to RNAs. Detailed plot in Fig. \ref{fig:con_detailed}. F. Our kinetic encoding model consists of the acceleration of correct binding events via kinetic cooperativity and catalyzing assembly factors.
  • Figure 2: Kinetic encoding of multiple target structures. A. A mixture of $N=9$ component species can assemble into $S=3$ structures of size $N=\ell\times\ell$. Components bind via the black rectangles. B. Growth occurs by adding/removing monomers at the boundary locations $\mathbf x=(x,y)$ (dashed contour). C. Addition ($k_i^+$) and removal ($k_i^-$) rates depend on the number of correct partners ($r_i$) in the neighborhood $\mathcal{N}_{\bf x}$ (0 denotes empty sites). Here, component 3 binds to its 3 neighbors as in the yellow target, while component 6 binds to only one of its two neighbors. Note that bonds serve to accelerate binding, rather than to stabilize it. D. Example self-assembly pathway of the yellow structure from a small seed. The acceleration of correct binding/unbinding events via $\delta$ [Eq. \ref{eq:rates}] is sketched as larger arrows.
  • Figure 3: Nearest-neighbor case yields slow retrieval. A. Assembly snapshots showing retrieval of a single square target of side length $\ell=14$ tiles from a two-layer nucleation seed. Assembly is allowed to grow beyond the target size $\ell$ up to $2\ell$. White denotes empty locations, yellow shows correctly placed tiles, and shades of gray denote incorrect tiles of different species. Other growth regimes are described in SI Sec. \ref{sec.SI-Growth}. B. The retrieval accuracy, i.e. the fraction of correctly placed tiles (line), follows the normalized assembly size (circles) over time and exhibits speed bottlenecks. C. The target retrieval time ($\tau_{\rm ret}$) from Eq. \ref{eq:t_ret} separates from the target lifetime ($\tau_{\rm life}$) for $\delta>\delta_{\rm min}$ [Eq. \ref{eq:d_min}]. D. In the irreversible, high-discrimination regime, each addition event maximizes the number of bonds. The mean waiting time is dominated by the inverse of the addition rate [Eq. \ref{eq:rates}], so additions with few bonds slow down the assembly.
  • Figure 4: Speed bottlenecks are also encoding bottlenecks. A. Assembly snapshots for $S=2$ arbitrary targets (yellow and red) show the formation of a chimeric structure (detailed description in SI Sec. \ref{sec.SI-Growth}). B. Errors make the retrieval accuracy (line) depart from the normalized assembly size (circles). C. For $S=2$ targets, the maximal accuracy at large $\delta$ is around 50%. D. Since each component has two different partners, kinetic discrimination is impossible at addition events with only one bond.
  • Figure 5: Local increases in connectivity resolve assembly bottlenecks. A. Schematic showing that adding one diagonal bond per layer enables fast and accurate retrieval of $S=2$ targets by ensuring $n_{\rm c}=2$. B. Assembly snapshots show faithful retrieval for the same settings as in Fig. \ref{fig:Z4_S=2}A, except that $z=4^+$. C. Retrieval time and lifetime separate at large $\delta$, as predicted. D. Accuracy, rescaled between 0 and 1, decreases sigmoidally with $S$. Inset: the sigmoid midpoint $S_{\rm max}$ shows negligible $N$-dependence [Eq. \ref{eq:Smax_gen}].
  • ...and 9 more figures