Endhered patterns in matchings and RNA

Célia Biane, Greg Hampikian, Sergey Kirgizov, Khaydar Nurligareev

TL;DR

The paper defines endhered patterns in matchings and analyzes their distribution for size-2 and size-3 patterns, establishing equidistribution under endhered twists and deriving recurrences, exponential generating functions, and asymptotics that include a Poisson limit for size-2 patterns. It then compares these theoretical results to native RNA secondary structures with pseudoknots, showing that such patterns are relatively rare in real data and that reducing to RNA shapes clarifies pattern presence. The work highlights a notable gap between unrestricted combinatorial models and empirical RNA patterns, suggesting the need for pattern-based constraints to better capture RNA folding and pseudoknot biology, and outlines directions for extending the approach to more complex patterns and datasets.

Abstract

An endhered (end-adhered) pattern is a subset of arcs in matchings, such that the corresponding starting points are consecutive and the same holds for the ending points. Such patterns are in one-to-one correspondence with the permutations. We focus on the occurrence frequency of such patterns in matchings and native (real-world) RNA structures with pseudoknots. We present combinatorial results related to the distribution and asymptotic behavior of the pattern 21, which corresponds to two consecutive base pairs frequently encountered in RNA, and the pattern 12, representing the archetypal minimal pseudoknot. We show that in matchings these two patterns are equidistributed, which is quite different from what we can find in native RNAs. We also examine the distribution of endhered patterns of size 3, showing how the patterns change under the transformation called endhered twist. Finally, we compute the distributions of endhered patterns of size 2 and 3 in native secondary RNA structures with pseudoknots and discuss possible outcomes of our study.
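The equidistribution of the patterns 12 and 21 can be checked by brute force for small sizes. The sketch below is illustrative, not the authors' code. It assumes a matching on points $1, \dots, 2n$ is a list of arcs $(i, j)$ with $i < j$. It also assumes an occurrence of 12 is a pair of arcs $(i, j)$, $(i+1, j+1)$, and an occurrence of 21 is a pair $(i, j)$, $(i+1, j-1)$. The helper names `matchings`, `count_size2` and `distributions` are hypothetical.

```python
from collections import Counter


def matchings(points):
    """Yield every perfect matching of the sorted list `points` as a list of arcs (i, j), i < j."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    # The smallest remaining point is always a starting point; try every partner for it.
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in matchings(remaining):
            yield [(first, partner)] + tail


def count_size2(matching):
    """Return (occurrences of 12, occurrences of 21) under the assumed definition."""
    end_of = dict(matching)  # starting point -> ending point
    c12 = c21 = 0
    for i, j in matching:
        j_next = end_of.get(i + 1)  # None when i + 1 is not a starting point
        if j_next is None:
            continue
        if j_next == j + 1:    # arcs (i, j), (i+1, j+1): crossing, pattern 12
            c12 += 1
        elif j_next == j - 1:  # arcs (i, j), (i+1, j-1): nested, pattern 21
            c21 += 1
    return c12, c21


def distributions(n):
    """Count matchings with n arcs by their number of 12 and of 21 occurrences."""
    d12, d21 = Counter(), Counter()
    for m in matchings(list(range(1, 2 * n + 1))):
        c12, c21 = count_size2(m)
        d12[c12] += 1
        d21[c21] += 1
    return d12, d21


if __name__ == "__main__":
    for n in range(1, 6):
        d12, d21 = distributions(n)
        print(n, sorted(d12.items()), sorted(d21.items()))
```

For each $n$, the two printed lists should coincide if the assumed definitions match the paper's. Full enumeration is only practical for small $n$, since the number of matchings grows as $(2n-1)!!$.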

Paper Structure (11 sections, 13 theorems, 43 equations, 12 figures, 4 tables)


Key Result

Lemma 2.4

Two endhered patterns $\pi$ and $\tau$ of the same size have the same joint distribution if one is obtained from the other by a left or right endhered twist. In other words, if $\pi = \operatorname{\mathtt{letw}}(\tau)$ or $\pi = \operatorname{\mathtt{retw}}(\tau)$, then the joint distributions of $\pi$ and $\tau$ coincide for any integers $n$, $k$ and $m$. In particular, $a_{n,k}(\pi) = a_{n,k}(\tau)$.

Figures (12)

  • Figure 1: A drawing (a), an arc diagram (b), an extended dot bracket notation (c), and a set of pairs (d) representing an example of an RNA secondary structure.
  • Figure 2: An example of matching construction, corresponding permutations (a), and a schema of recursive construction of matchings (b).
  • Figure 3: Endhered pattern 231 (a) and an example (b) of its occurrence.
  • Figure 4: An example of the right endhered twist, runs of right points are underlined.
  • Figure 5: Geometrical meaning of endhered twists.
  • ...and 7 more figures

Theorems & Definitions (31)

  • Definition 2.1
  • Definition 2.2
  • Lemma 2.4
  • proof
  • Remark 2.5
  • Corollary 2.6
  • Remark 2.7
  • Theorem 2.8
  • proof
  • Lemma 2.9
  • ...and 21 more