gRNAde: Geometric Deep Learning for 3D RNA inverse design

Chaitanya K. Joshi, Arian R. Jamasb, Ramon Viñas, Charles Harris, Simon V. Mathis, Alex Morehead, Rishabh Anand, Pietro Liò

TL;DR

gRNAde tackles 3D RNA inverse design by modeling conformational ensembles with a multi-state SE(3)-equivariant GNN and autoregressive decoding to generate backbone-conditioned sequences. It introduces a geometric multi-graph representation, state-wise GNN encoding, and state-invariant pooling to produce sequences that respect 3D structure and dynamics. The approach yields higher native sequence recovery and much faster inference than Rosetta on single-state benchmarks, and extends to multi-state design with improved performance in flexible regions, plus zero-shot ranking of fitness landscapes. Wet-lab validation via OpenKnot Round 6 shows competitive OpenKnot scores and a higher success rate than Rosetta, underscoring practical utility and generalizability to real experimental workflows.

Abstract

Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. gRNAde uses a multi-state Graph Neural Network and autoregressive decoding to generate candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. (2010), gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent ribozyme. Experimental wet lab validation on 10 different structured RNA backbones finds that gRNAde has a success rate of 50% at designing pseudoknotted RNA structures, a significant advance over 35% for Rosetta. Open source code and tutorials are available at: https://github.com/chaitjo/geometric-rna-design
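
To make the headline metric concrete, here is a minimal sketch of native sequence recovery: the percentage of positions at which a designed sequence matches the native one. The function names `sequence_recovery` and `mean_recovery` are hypothetical and not taken from the gRNAde codebase. Averaging over several sampled designs is an assumption about how a per-RNA number might be aggregated.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Percentage of positions where the designed nucleotide equals the native one."""
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(n == d for n, d in zip(native, designed))
    return 100.0 * matches / len(native)


def mean_recovery(native: str, samples: list[str]) -> float:
    """Average recovery over several designed samples for one backbone (assumed aggregation)."""
    if not samples:
        raise ValueError("need at least one designed sample")
    return sum(sequence_recovery(native, s) for s in samples) / len(samples)
```

For example, a design that differs from a 6-nucleotide native sequence at one position recovers 5 of 6 nucleotides, or about 83%.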

Paper Structure (17 sections, 2 equations, 13 figures, 2 tables)


Figures (13)

  • Figure 1: The gRNAde pipeline for 3D RNA inverse design. gRNAde is a generative model for RNA sequence design conditioned on backbone 3D structure(s). gRNAde processes one or more RNA backbone graphs (a conformational ensemble) via a multi-state GNN encoder which is equivariant to 3D roto-translation of coordinates as well as conformational state order, followed by conformational state order-invariant pooling and autoregressive sequence decoding.
  • Figure 2: gRNAde featurizes RNA backbone structures as 3D geometric graphs. Each RNA nucleotide is a node in the graph, represented by 3 coarse-grained beads at the coordinates of P, C4', and N1 (pyrimidines) or N9 (purines), which are used to compute initial geometric features and edges to nearest neighbours in 3D space. Backbone chain figure adapted from Ingraham et al. (2019).
  • Figure 3: In-silico evaluation metrics for gRNAde designed sequences. We consider (1) sequence recovery, the percentage of native nucleotides recovered in designed samples, (2) self-consistency scores, which are measured by 'forward folding' designed sequences using a structure predictor and measuring how well 2D and 3D structure are recovered (we use EternaFold and RhoFold for 2D/3D structure prediction, respectively). We also report (3) perplexity, the model's estimate of the likelihood of a sequence given a backbone.
  • Figure 4: gRNAde compared to Rosetta for single-state design. (a) We benchmark native sequence recovery of gRNAde, RDesign, Rosetta, FARNA and ViennaRNA on 14 RNA structures of interest identified by Das et al. (2010). gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45%) and all other methods. (b) Sequence recovery per sample for Rosetta and gRNAde, shaded by gRNAde's perplexity for each sample. gRNAde's perplexity is correlated with native sequence recovery for designed sequences (Pearson correlation: -0.76, Spearman correlation: -0.67). Full results on the single-state test set are available in the appendix, and per-RNA results in the appendix table for single-state design.
  • Figure 5: Multi-state design benchmark. (a) Multi-state gRNAde shows a consistent 3-5% improvement over the single-state variant in terms of sequence recovery on the multi-state test set of 100 RNAs, with the best performance obtained using 3 states. (b) When plotting sequence recovery per-nucleotide, multi-state gRNAde improves over a single-state model for structurally flexible regions of RNAs, as characterised by nucleotides that tend to undergo changes in base pairing (left) and nucleotides with higher average RMSD across multiple states (right). Marginal histograms in blue show the distribution of values. We plot performance for one consistent random seed across all models; collated results and ablations are available in the appendix.
  • ...and 8 more figures
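
Two ideas from the captions above can be sketched in plain Python: pooling that is invariant to conformational state order (Figure 1) and perplexity as a likelihood-based score (Figure 3). This is an illustration under stated assumptions, not the paper's implementation. Mean pooling is one simple order-invariant choice; the excerpt does not say which pooling operator gRNAde uses. Perplexity is taken here as the exponential of the mean negative natural-log probability per nucleotide. The names `pool_states` and `perplexity` are hypothetical.

```python
import math


def pool_states(state_embeddings):
    """Average per-nucleotide embeddings over conformational states.

    state_embeddings[k][i] is the feature vector (list of floats) for
    nucleotide i in state k. Averaging over k gives the same result for
    any ordering of the states (assumed pooling operator: mean).
    """
    n_states = len(state_embeddings)
    n_nodes = len(state_embeddings[0])
    dim = len(state_embeddings[0][0])
    return [
        [
            sum(state_embeddings[k][i][d] for k in range(n_states)) / n_states
            for d in range(dim)
        ]
        for i in range(n_nodes)
    ]


def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to each nucleotide.

    token_probs[i] is the probability the model gave to the nucleotide at
    position i; every entry must be in (0, 1].
    """
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)
```

Under this definition, a model that is uniformly uncertain over the four nucleotides has a perplexity of about 4, and lower values mean the model considers the sequence more likely given the backbone. This matches the direction of the negative correlation with sequence recovery reported in Figure 4.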