Path Signatures Enable Model-Free Mapping of RNA Modifications

Maud Lemercier; Paola Arrubarrena; Salvatore Di Giorgio; Julia Brettschneider; Thomas Cass; Valerie Griesche; Isabel S. Naarmann-de Vries; Anastasia Papavasiliou; Alessia Ruggieri; Irem Tellioglu; Chia Ching Wu; F. Nina Papavasiliou; Terry Lyons

TL;DR

This work introduces a model-free computational method that reframes modification detection as an anomaly detection problem, requiring only canonical (unmodified) RNA reads without any other annotated data, and applies this framework to dengue virus transcripts and mammalian mRNAs.

Abstract

Detecting chemical modifications on RNA molecules remains a key challenge in epitranscriptomics. Traditional reverse transcription-based sequencing methods introduce enzyme- and sequence-dependent biases and fragment RNA molecules, confounding the accurate mapping of modifications across the transcriptome. Nanopore direct RNA sequencing offers a powerful alternative by preserving native RNA molecules, enabling the detection of modifications at single-molecule resolution. However, current computational tools can identify only a limited subset of modification types within well-characterized sequence contexts for which ample training data exists. Here, we introduce a model-free computational method that reframes modification detection as an anomaly detection problem, requiring only canonical (unmodified) RNA reads without any other annotated data. For each nanopore read, our approach extracts robust, modification-sensitive features from the raw ionic current signal at a site using the signature transform, then computes an anomaly score by comparing the resulting feature vector to its nearest neighbors in an unmodified reference dataset. We convert anomaly scores into statistical p-values to enable anomaly detection at both individual read and site levels. Validation on densely modified E. coli rRNA demonstrates that our approach detects known sites harboring diverse modification types, without prior training on these modifications. We further apply this framework to dengue virus (DENV) transcripts and mammalian mRNAs. For DENV sfRNA, it reveals a novel 2'-O-methylated site, which we validate orthogonally by qRT-PCR assays. These results demonstrate that our model-free approach operates robustly across different types of RNAs and datasets generated with different nanopore sequencing chemistries.
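
The per-read pipeline described in the abstract (signature features, nearest-neighbor anomaly score, p-value) can be illustrated with a small sketch. The abstract does not state the signature truncation depth, how the current signal is embedded as a path, the number of neighbors, the distance metric, or how p-values are calibrated, so every such choice below is an assumption made for illustration: a depth-2 signature of the time-augmented signal, mean Euclidean distance to k = 5 neighbors, and an empirical p-value against held-out canonical reads. All function names and the toy signals are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def time_augment(signal):
    """Embed a 1-D signal as a 2-D path (time, value). Assumed augmentation."""
    signal = np.asarray(signal, dtype=float)
    t = np.linspace(0.0, 1.0, len(signal))
    return np.column_stack([t, signal])

def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path of shape (T, d)."""
    path = np.asarray(path, dtype=float)
    inc = np.diff(path, axis=0)              # segment increments, shape (T-1, d)
    level1 = inc.sum(axis=0)                 # total increment
    start = path[:-1] - path[0]              # displacement at each segment start
    # Iterated integrals: sum_k (X_k - X_0) (x) dX_k + 0.5 * dX_k (x) dX_k
    level2 = start.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([level1, level2.ravel()])

def knn_score(x, reference, k=5):
    """Anomaly score: mean Euclidean distance to the k nearest reference vectors."""
    dists = np.linalg.norm(reference - x, axis=1)
    return np.sort(dists)[:k].mean()

def empirical_p_value(score, calibration_scores):
    """Fraction of held-out canonical scores at least as large as `score`."""
    calibration_scores = np.asarray(calibration_scores)
    return (1 + np.sum(calibration_scores >= score)) / (1 + len(calibration_scores))

# Toy data only: a noisy half-sine stands in for the canonical current trace,
# and a doubled amplitude stands in for a modified read.
rng = np.random.default_rng(0)

def toy_read(amplitude=1.0, n=30):
    base = np.sin(np.linspace(0.0, np.pi, n))
    return amplitude * base + 0.05 * rng.normal(size=n)

def featurize(signal):
    return signature_depth2(time_augment(signal))

reference = np.array([featurize(toy_read()) for _ in range(200)])
calibration = [knn_score(featurize(toy_read()), reference) for _ in range(100)]

p_canonical = empirical_p_value(knn_score(featurize(toy_read()), reference), calibration)
p_modified = empirical_p_value(knn_score(featurize(toy_read(2.0)), reference), calibration)
print(f"canonical read p = {p_canonical:.3f}, modified read p = {p_modified:.3f}")
```

Time augmentation is used here because the signature of a one-dimensional path carries no information beyond its total increment. The signature is also invariant to a constant offset in the signal, which is why the toy modification changes the shape of the trace rather than shifting it.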
