BeeRNA: tertiary structure-based RNA inverse folding using Artificial Bee Colony

Mehyar Mlaweh, Tristan Cazenave, Ines Alaya

TL;DR

BeeRNA introduces a training-free, Artificial Bee Colony–based framework for designing RNA sequences that fold into predefined tertiary structures, using RhoFold for structure prediction and a two-stage fitness that combines base-pair distance filtering with RMSD minimization. The method enforces thermodynamic stability and GC-content constraints and performs best on short- to medium-length RNAs (<100 nt), with practical CPU runtimes. Across the RNASolo, RFAM, and DasDataset benchmarks, BeeRNA achieves competitive or superior structural fidelity compared to state-of-the-art deep learning methods, without requiring training data. This work demonstrates the viability of bio-inspired search for practical RNA 3D design and lays the groundwork for hybrid approaches that integrate newer predictors and multi-objective formulations.

Abstract

The Ribonucleic Acid (RNA) inverse folding problem, designing nucleotide sequences that fold into specific tertiary structures, is a fundamental computational biology problem with important applications in synthetic biology and bioengineering. The design of complex three-dimensional RNA architectures remains computationally demanding and mostly unresolved, as most existing approaches focus on secondary structures. To address tertiary RNA inverse folding, we present BeeRNA, a bio-inspired method that employs the Artificial Bee Colony (ABC) optimization algorithm. Our approach combines base-pair distance filtering with RMSD-based structural assessment using RhoFold for structure prediction, resulting in a two-stage fitness evaluation strategy. To guarantee biologically plausible sequences with balanced GC content, the algorithm takes thermodynamic constraints and adaptive mutation rates into consideration. In this work, we focus primarily on short and medium-length RNAs (<100 nucleotides), a biologically significant regime that includes microRNAs (miRNAs), aptamers, and ribozymes, where BeeRNA achieves high structural fidelity with practical CPU runtimes. The lightweight, training-free implementation will be publicly released for reproducibility, offering a promising bio-inspired approach for RNA design in therapeutics and biotechnology.
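
The two-stage fitness evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold `max_bp_distance`, the `penalty` value, and the callables `fold_2d` and `rmsd_to_target` are hypothetical placeholders. In BeeRNA the expensive second stage would involve a RhoFold prediction; here it is left as a function the caller supplies. Base-pair distance is computed as the number of base pairs present in one dot-bracket structure but not the other.

```python
from typing import Callable, Set, Tuple


def parse_pairs(dot_bracket: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced dot-bracket structure")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced dot-bracket structure")
    return pairs


def base_pair_distance(a: str, b: str) -> int:
    """Count base pairs that appear in exactly one of the two structures."""
    if len(a) != len(b):
        raise ValueError("structures must have equal length")
    return len(parse_pairs(a) ^ parse_pairs(b))


def two_stage_fitness(
    seq: str,
    target_2d: str,
    fold_2d: Callable[[str], str],            # hypothetical: cheap 2D predictor
    rmsd_to_target: Callable[[str], float],   # hypothetical: 3D prediction + RMSD
    max_bp_distance: int = 4,                 # assumed threshold, not from the paper
    penalty: float = 1e3,                     # assumed penalty, not from the paper
) -> float:
    """Lower is better. Stage 1 filters cheaply; stage 2 runs only if it passes."""
    d = base_pair_distance(fold_2d(seq), target_2d)
    if d > max_bp_distance:
        # Candidate rejected by the filter: skip the costly 3D evaluation.
        return penalty + d
    return rmsd_to_target(seq)
```

The design choice shown here is that rejected candidates still receive a graded score (`penalty + d`), so a search procedure can tell a near miss from a poor candidate without paying for a 3D prediction.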

Paper Structure

This paper contains 23 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: BeeRNA optimization process for a sample RNA structure. The flowchart illustrates the step-by-step progression of BeeRNA, from the initial RNA 3D structure to the final predicted sequence. The ABC algorithm uses RMSD as its fitness function, comparing the structure predicted by RhoFold with the initial structure. A preliminary base-pair distance filter is applied to candidate sequences to improve efficiency.
  • Figure 2: Predicted structure (orange) vs. sequence structure (blue) for 2OUE RNA. Despite 98.4% native recovery, RMSD and GDT-TS show a poor structural match.
  • Figure 3: RMSD vs. sequence length for BeeRNA and gRNAde on RNASolo. Points represent RMSD values (Å) for individual sequences, plotted against sequence length (3–30 nucleotides). Regression lines show RMSD trends for each method.
  • Figure 4: RMSD vs. sequence length for BeeRNA and gRNAde on RFAM. The shaded region marks sequences over 100 nucleotides.
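
The optimization loop summarized in Figure 1 follows the general Artificial Bee Colony pattern of employed, onlooker, and scout phases. The sketch below shows that generic pattern applied to RNA sequences and is not the authors' code. The function names, the single-point mutation, the parameter defaults, and the toy fitness in the usage note are all assumptions for illustration; BeeRNA's adaptive mutation rates and thermodynamic and GC-content constraints are not modeled. The fitness is assumed to be non-negative with lower values being better, as is the case for RMSD.

```python
import random
from typing import Callable, Tuple

ALPHABET = "ACGU"


def mutate(seq: str, rng: random.Random) -> str:
    """Return a copy of seq with one position replaced by a different base."""
    i = rng.randrange(len(seq))
    new_base = rng.choice([b for b in ALPHABET if b != seq[i]])
    return seq[:i] + new_base + seq[i + 1:]


def abc_search(
    length: int,
    fitness: Callable[[str], float],  # non-negative, lower is better
    n_sources: int = 10,              # assumed colony size
    limit: int = 5,                   # assumed abandonment limit
    cycles: int = 50,                 # assumed number of cycles
    seed: int = 0,
) -> Tuple[str, float]:
    """Generic ABC loop over RNA sequences; returns (best sequence, best fitness)."""
    rng = random.Random(seed)

    def rand_seq() -> str:
        return "".join(rng.choice(ALPHABET) for _ in range(length))

    sources = [rand_seq() for _ in range(n_sources)]
    scores = [fitness(s) for s in sources]
    trials = [0] * n_sources
    best_i = min(range(n_sources), key=scores.__getitem__)
    best = (sources[best_i], scores[best_i])

    def try_neighbor(i: int) -> None:
        """Greedy step: keep the mutated neighbor only if it improves source i."""
        nonlocal best
        cand = mutate(sources[i], rng)
        f = fitness(cand)
        if f < scores[i]:
            sources[i], scores[i], trials[i] = cand, f, 0
            if f < best[1]:
                best = (cand, f)
        else:
            trials[i] += 1

    for _ in range(cycles):
        # Employed bees: one local move per food source.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker bees: revisit sources with probability favoring low fitness.
        weights = [1.0 / (1.0 + s) for s in scores]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbor(i)
        # Scout bees: abandon sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = rand_seq()
                scores[i] = fitness(sources[i])
                trials[i] = 0
                if scores[i] < best[1]:
                    best = (sources[i], scores[i])
    return best
```

As a quick check, the loop can be run with a toy fitness such as the Hamming distance to a fixed sequence, for example `abc_search(9, lambda s: sum(a != b for a, b in zip(s, "GGGAAACCC")))`. In a BeeRNA-like setting the fitness argument would instead wrap structure prediction and RMSD computation.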