Fast and Versatile RNA Design via Motif-level Divide-and-Conquer and Structure-level Rival Search

Tianshuo Zhou, David H. Mathews, Liang Huang

TL;DR

This work introduces a fast and versatile RNA design algorithm inspired by previous work on the undesignability of RNA structures and motifs, excelling in both ensemble- and MFE-based metrics.

Abstract

RNA design aims to identify RNA sequences that fold into a target secondary structure. The task is computationally demanding. Most existing methods focus on either minimum free energy (MFE)-based or ensemble-based metrics, leaving a gap for a unified approach that performs well across both. We introduce a fast and versatile RNA design algorithm inspired by our previous work on the undesignability of RNA structures and motifs (i.e., sets of contiguous structural loops). Our approach decomposes a target structure into a tree of sub-targets in which each leaf node corresponds to a motif and each internal node corresponds to a substructure. We first design partial sequences for each motif; these partial sequences are then selectively and recursively combined via the cube pruning strategy borrowed from computational linguistics, enabling effective optimization of ensemble-based metrics. Finally, a novel whole-structure rival search further refines sequences to suppress misfolded alternatives and enhance MFE-based performance. Our method is highly efficient and achieves state-of-the-art results on native RNAsolo structures and the Eterna100 benchmark, excelling in both ensemble- and MFE-based metrics. Additionally, it substantially improves design on a long-structure benchmark derived from 16S rRNA, increasing average folding probability from 0.18 to 0.39 with an order-of-magnitude speedup, demonstrating its effectiveness and scalability. Availability: Source code and data are available at https://github.com/shanry/FastDesign.
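
The combination step mentioned in the abstract can be illustrated with a minimal, generic sketch of cube pruning: given two ranked lists of candidate partial sequences, explore the grid of pairings best-first and keep only the top k. This is not the authors' implementation (see the repository linked above for that). The names `cube_prune` and `toy_combine`, the candidate scores, and the rule of concatenating sequences and multiplying scores are all assumptions made for illustration.

```python
import heapq
from typing import Callable, List, Tuple

# (score, partial sequence); a higher score is better. Hypothetical representation.
Candidate = Tuple[float, str]


def cube_prune(left: List[Candidate], right: List[Candidate], k: int,
               combine: Callable[[Candidate, Candidate], Candidate]) -> List[Candidate]:
    """Return up to k combined candidates, exploring the pairing grid best-first."""
    if not left or not right or k <= 0:
        return []
    left = sorted(left, reverse=True)
    right = sorted(right, reverse=True)

    score, seq = combine(left[0], right[0])
    heap = [(-score, 0, 0, seq)]  # max-heap emulated with negated scores
    seen = {(0, 0)}
    out: List[Candidate] = []

    while heap and len(out) < k:
        neg_score, i, j, seq = heapq.heappop(heap)
        out.append((-neg_score, seq))
        # Push the two grid neighbours of the cell that was just popped.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                s, q = combine(left[ni], right[nj])
                heapq.heappush(heap, (-s, ni, nj, q))

    # With a non-monotonic combine(), pops may come out of order, so re-sort.
    return sorted(out, reverse=True)


def toy_combine(a: Candidate, b: Candidate) -> Candidate:
    """Toy rule (an assumption): multiply scores and concatenate sequences."""
    return (a[0] * b[0], a[1] + b[1])


left = [(0.9, "GC"), (0.5, "AU")]
right = [(0.8, "GAAA"), (0.6, "UUCG")]
for score, seq in cube_prune(left, right, k=3, combine=toy_combine):
    print(f"{seq}\t{score:.2f}")
```

With a monotonic rule such as `toy_combine`, the best-first exploration returns the exact top k. When combined scores depend on interactions between the two halves, as an ensemble-based score of a joined sequence generally would, the same procedure becomes a heuristic that scores only a small part of the grid.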

Paper Structure

This paper contains 37 sections, 2 theorems, 15 equations, 16 figures, 9 tables, and 8 algorithms.

Key Result

Theorem 1

where $\delta$ quantifies the risk induced by the decomposition $\boldsymbol{m} = \boldsymbol{m}_a + \boldsymbol{m}_b$.

Figures (16)

  • Figure 1: Overview of our unified RNA design framework. The method first applies a divide–conquer–combine strategy to generate high-probability designs, followed by a rival-structure–guided search when the MFE criterion is not satisfied, ultimately yielding MFE designs.
  • Figure 1: Design constraint induced by $\boldsymbol{y}'$ in Fig. \ref{fig:ex1}. $I$: indices or positions; $\hat{\boldsymbol{x}}^1$–$\hat{\boldsymbol{x}}^9$: nucleotides on $I$.
  • Figure 2: An example of secondary structure and loops.
  • Figure 2: Comparison of RNA design methods on RNAsolo764.
  • Figure 3: A target structure is decomposed into a tree of subtargets.
  • ...and 11 more figures

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Proposition 2
  • Definition 1
  • Definition 2
  • Definition 3