3D-based RNA function prediction tools in rnaglib
Carlos Oliver, Vincent Mallet, Jérôme Waldispühl
TL;DR
RNA 3D structural data are expanding, enabling data-driven discovery of structure–function relationships, but building standardized representations and datasets remains difficult. The chapter presents rnaglib, a Python toolkit that encodes RNA 3D structures as graphs whose edges are annotated with Leontis-Westhof base-pair geometry, provides dataset construction utilities, and supports self-supervised and supervised learning workflows. It introduces RNADataset and multiple Representations (graph, point cloud, voxel), together with a training loop that predicts functional attributes such as binding residues, so that a complete ML pipeline can be assembled end to end. The work lowers the barrier to geometric deep learning on RNA and supports design and discovery by linking 3D structure to function in a scalable, extensible framework.
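To make the graph encoding mentioned above concrete, the following is a minimal sketch in plain Python that does not use rnaglib itself. It assumes that nucleotides are nodes, that backbone links carry a label such as "B53", and that base pairs carry a three-letter Leontis-Westhof label (orientation followed by the two interacting edges, for example "CWW" for a canonical cis Watson-Crick pair). The function names, the toy sequence, and its base pairs are hypothetical and are not taken from a real structure or from the library's API.

```python
# Illustrative sketch only: a toy RNA graph with Leontis-Westhof edge labels.
# Nothing here is rnaglib code; names and data are hypothetical.

# The 12 Leontis-Westhof families: orientation (C = cis, T = trans) combined
# with an unordered pair of interacting edges (W = Watson-Crick,
# H = Hoogsteen, S = Sugar).
EDGE_TYPES = "WHS"
LW_FAMILIES = [
    orientation + a + b
    for orientation in "CT"
    for i, a in enumerate(EDGE_TYPES)
    for b in EDGE_TYPES[i:]
]

BACKBONE = "B53"  # assumed label for a 5'-to-3' backbone link


def build_rna_graph(sequence, base_pairs):
    """Return a dict-based graph: nodes keyed by 1-based position,
    edges keyed by (i, j) with i < j and valued by their label."""
    nodes = {i: {"nt": nt} for i, nt in enumerate(sequence, start=1)}
    edges = {}
    # Consecutive nucleotides are linked by the backbone.
    for i in range(1, len(sequence)):
        edges[(i, i + 1)] = BACKBONE
    # Base pairs are added with their Leontis-Westhof family label.
    for i, j, label in base_pairs:
        if label not in LW_FAMILIES:
            raise ValueError(f"unknown Leontis-Westhof label: {label}")
        edges[(min(i, j), max(i, j))] = label
    return {"nodes": nodes, "edges": edges}


def noncanonical_edges(graph):
    """List base pairs that are neither backbone links nor canonical CWW pairs."""
    return sorted(
        pair
        for pair, label in graph["edges"].items()
        if label not in (BACKBONE, "CWW")
    )


# Toy hairpin: two canonical pairs closing a loop, plus one hypothetical
# trans Hoogsteen-Sugar pair inside the loop.
graph = build_rna_graph(
    "GCGAAAGC",
    [(1, 8, "CWW"), (2, 7, "CWW"), (3, 6, "THS")],
)
print(len(graph["edges"]), noncanonical_edges(graph))
```

A representation of this kind keeps non-canonical interactions explicit as edge labels, which is the information a graph-based model can exploit beyond sequence and secondary structure.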
Abstract
Understanding the connection between complex structural features of RNA and biological function is a fundamental challenge in evolutionary studies and in RNA design. However, building datasets of RNA 3D structures and making appropriate modeling choices remain time-consuming and lack standardization. In this chapter, we describe how to use rnaglib to train supervised and unsupervised machine learning models for function prediction on datasets of RNA 3D structures.
