Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding

Guang Yang, Lei Fan

TL;DR

This work tackles designing RNA sequences that realize target 3D structures by introducing HyperRNA, a hypergraph-based encoder–decoder framework. It preprocesses RNA–protein backbones with a 3-bead coarse-grained representation, applies a self-attention–augmented HGNN to capture high-order interactions, and autoregressively decodes sequences using geometric vector perceptrons. Training combines sequence and structure losses, while inference uses flow-based trajectory refinement to account for RNA flexibility. Across PDBBind and RNAsolo benchmarks, HyperRNA demonstrates superior structural accuracy and diverse, valid sequence design, highlighting the value of hypergraphs in RNA design. The results underscore a promising direction for RNA engineering that better accounts for multi-way nucleotide interactions and 3D geometry.

Abstract

The RNA inverse folding problem, a key challenge in RNA design, involves identifying nucleotide sequences that can fold into desired secondary structures, which are critical for ensuring molecular stability and function. The task is challenging because of the intricate relationship between sequence and structure. In this paper, we propose HyperRNA, a generative model with an encoder-decoder architecture that leverages hypergraphs to design RNA sequences. Specifically, HyperRNA consists of three main components: preprocessing, encoding, and decoding. In the preprocessing stage, graph structures are constructed by extracting the atom coordinates of the RNA backbone based on a 3-bead coarse-grained representation. The encoding stage processes these graphs, capturing higher-order dependencies and complex biomolecular interactions using an attention embedding module and a hypergraph-based encoder. Finally, the decoding stage generates the RNA sequence in an autoregressive manner. We conducted quantitative and qualitative experiments on the PDBBind and RNAsolo datasets to evaluate the inverse folding task for RNA sequence generation and RNA-protein complex sequence generation. The experimental results demonstrate that HyperRNA outperforms existing RNA design methods and highlight the potential of leveraging hypergraphs in RNA engineering.
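
To make the idea of a hypergraph-based encoder more concrete, the following is a minimal sketch of one standard hypergraph convolution (HGNN) layer applied to 3D coordinates. It is not HyperRNA's implementation: the abstract does not say how hyperedges are built, so the k-nearest-neighbour construction, the random coordinates standing in for coarse-grained bead positions, and the names `knn_incidence` and `hgnn_layer` are all assumptions made for illustration.

```python
import numpy as np

def knn_incidence(coords, k):
    """Incidence matrix H (n_nodes x n_edges) with one hyperedge per node.

    ASSUMPTION: hyperedge e groups node e with its k nearest neighbours;
    the source does not specify how HyperRNA forms hyperedges.
    """
    n = coords.shape[0]
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(dists[e])[: k + 1]  # node e (distance 0) plus k neighbours
        H[members, e] = 1.0
    return H

def hgnn_layer(X, H, theta, edge_w=None):
    """One HGNN layer: ReLU(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X theta)."""
    w = np.ones(H.shape[1]) if edge_w is None else edge_w
    dv = H @ w               # weighted vertex degrees
    de = H.sum(axis=0)       # number of nodes in each hyperedge
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    left = dv_inv_sqrt[:, None] * H * (w / de)[None, :]
    A = (left @ H.T) * dv_inv_sqrt[None, :]
    return np.maximum(A @ X @ theta, 0.0)

rng = np.random.default_rng(0)
n_nodes, d_in, d_out, k = 12, 8, 4, 3
coords = rng.normal(size=(n_nodes, 3))   # hypothetical stand-in for bead coordinates
X = rng.normal(size=(n_nodes, d_in))     # hypothetical node features
theta = rng.normal(size=(d_in, d_out))   # weight matrix (learned in practice)

H = knn_incidence(coords, k)
Y = hgnn_layer(X, H, theta)
print(Y.shape)
```

Unlike an ordinary graph edge, each column of `H` can connect more than two nodes, which is how a hypergraph represents multi-way interactions among nucleotides.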

Paper Structure

This paper contains 16 sections, 9 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Our framework is divided into three steps: preprocessing the RNA backbone, encoding graph features and generating RNA sequences.
  • Figure 2: HyperRNA model architecture. It consists of three steps: preprocessing, encoding, and decoding. First, atom coordinates of the RNA backbone are extracted using a 3-bead coarse-grained representation to construct graph structures. These graphs are then embedded by the attention embedding module and processed by a hypergraph-based encoder. Finally, the hypergraph-based decoder generates RNA sequences autoregressively.
  • Figure 3: The architecture of the self-attention mechanism. $Q$, $K$, and $V$ are calculated from the graph vector feature using the corresponding learnable weight matrices $\mathbf{w}_q$, $\mathbf{w}_k$, and $\mathbf{w}_v$.
  • Figure 4: Visualizations of RNA structures and sequences generated by HyperRNA. Top: The crystal structure of Rev (PDB ID: 4PMI). Bottom: Design of RNA for composing the crystal structure of the Drosophila melanogaster SNF (PDB ID: 6F4G).
  • Figure 5: Ablation studies of HyperRNA on RNA sequence recovery. The top panel shows results on the sequence-similar split, and the bottom panel shows results on the RF2NA split. "AP" refers to attention pooling, and "SA" refers to self-attention embedding.
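
The self-attention computation described in the Figure 3 caption can be sketched as follows. This is a generic scaled dot-product formulation given as an illustration only: the scaling by the square root of the key dimension, the feature sizes, and the random inputs are assumptions, and the paper's attention embedding module may differ in its details.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over node features x of shape (n, d_in)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # project features to Q, K, V
    scores = q @ k.T / np.sqrt(k.shape[-1])        # pairwise similarity (scaling is an assumption)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
n_nodes, d_in, d_head = 6, 8, 4                    # hypothetical sizes
x = rng.normal(size=(n_nodes, d_in))               # stand-in for the graph vector feature
w_q, w_k, w_v = (rng.normal(size=(d_in, d_head)) for _ in range(3))

out, attn = self_attention(x, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Each row of `attn` sums to one, so every node's output is a weighted average of the value vectors of all nodes.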