Kirigami: large convolutional kernels improve deep learning-based RNA secondary structure prediction
Marc Harary, Chengxin Zhang
TL;DR
Kirigami tackles RNA secondary structure prediction by reframing base-pairing as a graph adjacency problem and applying a large-kernel fully convolutional network ($k=11$) with no pooling to capture long-range dependencies. A series of post-processing constraints (symmetry, canonical pair enforcement via prime-encoding, and multiplet elimination) produces physically plausible structures while still allowing pseudoknots. On the bpRNA-based TS0 benchmark of $1{,}305$ structures, Kirigami achieves a mean MCC of $0.706$ overall (11-40% higher than other leading methods) and $0.615$ on pseudoknots (58-400% higher), highlighting the benefit of large receptive fields for RNA topology. These results suggest that neural approaches can surpass traditional thermodynamic models in many cases, while underscoring data scarcity as a key limitation and pointing toward future directions such as attention mechanisms and expanded experimental structure datasets.
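The three post-processing constraints named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the prime assignment in `PRIMES`, the function name `postprocess`, the `threshold` parameter, and the greedy strategy for removing multiplets are all assumptions made for illustration, since the summary states only that such constraints exist. The idea behind the assumed prime encoding is that each canonical pair (A-U, G-C, G-U) maps to a unique product of two primes, so one outer product yields a mask of admissible pairs.

```python
import numpy as np

# Hypothetical prime assignment; the source says only "prime-encoding".
PRIMES = {"A": 2, "U": 3, "G": 5, "C": 7}
# Products that identify canonical pairs: A-U, G-C and the G-U wobble.
CANONICAL_PRODUCTS = [2 * 3, 5 * 7, 5 * 3]


def postprocess(probs, seq, threshold=0.5):
    """Turn an n x n matrix of pairing scores into a list of base pairs.

    `threshold` is an assumed cutoff, not a value taken from the paper.
    """
    n = len(seq)
    probs = np.asarray(probs, dtype=float)

    # 1. Symmetry: pairing (i, j) and pairing (j, i) are the same event.
    sym = (probs + probs.T) / 2.0

    # 2. Canonical pairs: keep entries whose prime product is canonical.
    codes = np.array([PRIMES[base] for base in seq])
    mask = np.isin(np.outer(codes, codes), CANONICAL_PRODUCTS)
    scores = np.where(mask, sym, 0.0)

    # 3. Multiplet elimination (greedy, assumed): visit candidate pairs
    #    from highest to lowest score and accept a pair only if neither
    #    nucleotide is already paired.
    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(-scores[rows, cols], kind="stable")
    used = np.zeros(n, dtype=bool)
    pairs = []
    for idx in order:
        i, j = int(rows[idx]), int(cols[idx])
        if scores[i, j] < threshold:
            break  # remaining candidates score even lower
        if not used[i] and not used[j]:
            used[i] = used[j] = True
            pairs.append((i, j))
    return sorted(pairs)
```

Because this sketch never rejects crossing pairs, pseudoknotted structures remain representable, which is consistent with the summary's claim that the constraints still allow pseudoknots.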
Abstract
We introduce a novel fully convolutional neural network (FCN) architecture for predicting the secondary structure of ribonucleic acid (RNA) molecules. Interpreting RNA structures as weighted graphs, we employ deep learning to estimate the probability of base pairing between nucleotide residues. Unique to our model are its massive 11-pixel kernels, which we argue provide a distinct advantage for FCNs in the specialized domain of RNA secondary structures. On a widely adopted, standardized test set comprising 1,305 molecules, the accuracy of our method exceeds that of current state-of-the-art (SOTA) secondary structure prediction software, achieving a Matthews Correlation Coefficient (MCC) 11-40% higher than that of other leading methods on overall structures and 58-400% higher on pseudoknots specifically.
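For readers unfamiliar with the metric, the following sketch shows one common way to compute the MCC of a predicted set of base pairs against a reference structure. The function name `pair_mcc` and the counting convention (every unordered position pair $i<j$ is one binary decision) are assumptions; the abstract does not specify how the authors count true negatives.

```python
import math


def pair_mcc(predicted, reference, length):
    """Matthews Correlation Coefficient over base-pair predictions.

    Assumed convention: each unordered pair of positions (i < j) in a
    sequence of `length` nucleotides is one binary decision.
    """
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}

    total = length * (length - 1) // 2  # number of candidate pairs
    tp = len(pred & ref)                # predicted and present
    fp = len(pred - ref)                # predicted but absent
    fn = len(ref - pred)                # present but missed
    tn = total - tp - fp - fn           # correctly left unpaired

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a marginal is zero; return 0.0 by convention.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

A perfect prediction scores 1.0 under this definition, and extra or missing pairs lower the score; a mean MCC such as the reported 0.706 would be the average of per-structure values.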
