RiboGen: RNA Sequence and Structure Co-Generation with Equivariant MultiFlow
Dana Rubin, Allan dos Santos Costa, Manvitha Ponnapati, Joseph Jacobson
TL;DR
The paper addresses the challenge of jointly generating RNA sequence and all-atom 3D structure. It introduces RiboGen, a MultiFlow framework that combines continuous Flow Matching for coordinates with discrete Flow Matching for sequences, implemented with Euclidean-equivariant networks to model 3D geometry. Empirical results show chemically valid backbone and base geometry, competitive self-consistency (TM-score) across a range of sequence lengths, and efficient one-shot co-generation compared to prior methods. The work highlights the potential of sequence-structure co-generation to accelerate RNA design and optimization tasks.
Abstract
Ribonucleic acid (RNA) plays fundamental roles in biological systems, from carrying genetic information to performing enzymatic functions. Understanding and designing RNA can enable novel therapeutic applications and biotechnological innovations. To enhance RNA design, in this paper we introduce RiboGen, the first deep learning model to simultaneously generate RNA sequence and all-atom 3D structure. RiboGen combines standard Flow Matching with Discrete Flow Matching over a multimodal data representation. RiboGen is based on Euclidean-equivariant neural networks for efficiently processing and learning three-dimensional geometry. Our experiments show that RiboGen can efficiently generate chemically plausible and self-consistent RNA samples, suggesting that co-generation of sequence and structure is a competitive approach for modeling RNA.
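To make the multimodal formulation concrete, the following is a minimal sketch of how a training example could be corrupted jointly in both modalities. It is not the RiboGen implementation: the linear Gaussian path for coordinates, the mask-based corruption for the sequence, the `MASK` index, and the function name `sample_noisy_pair` are all assumptions chosen for illustration.

```python
import numpy as np

MASK = 4  # hypothetical mask token index; 0..3 stand for A, C, G, U


def sample_noisy_pair(coords, seq, t, rng):
    """Corrupt a (coords, seq) pair at time t in [0, 1]; t = 1 is clean data.

    Assumed interpolants (not taken from the paper):
      - coordinates: straight line between Gaussian noise and the data
      - sequence: each token is kept with probability t, otherwise masked
    """
    # Continuous part: linear path from noise x0 to data x1 = coords.
    x0 = rng.standard_normal(coords.shape)
    x_t = (1.0 - t) * x0 + t * coords
    # Regression target for the velocity field along the linear path.
    velocity = coords - x0
    # Discrete part: independent per-token masking.
    keep = rng.random(seq.shape) < t
    s_t = np.where(keep, seq, MASK)
    return x_t, velocity, s_t


rng = np.random.default_rng(0)
coords = rng.standard_normal((8, 3))  # 8 toy points with xyz coordinates
seq = rng.integers(0, 4, size=8)      # toy nucleotide indices
x_t, velocity, s_t = sample_noisy_pair(coords, seq, t=0.5, rng=rng)
```

Under these assumptions, a network would take `(x_t, s_t, t)` as input and be trained to regress `velocity` for the coordinates and to classify the clean tokens at masked positions; at sampling time both modalities would be integrated from `t = 0` to `t = 1` together.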
