An efficient algorithm to compute the minimum free energy of interacting nucleic acid strands
Ahmed Shalaby, Damien Woods
TL;DR
The paper addresses prediction of the minimum free energy (MFE) of connected, unpseudoknotted multi-strand nucleic acid structures in the presence of rotational symmetry, a problem that remained open even though partition-function algorithms were known for a constant number of strands. It takes a two-pronged approach: (i) extend the symmetry-naive dynamic program (DP) of Dirks et al. to compute a baseline MFE, and (ii) apply a backtracking procedure that exploits a polynomial bound on the number of symmetric configurations, obtained via pizza cuts and a central loop, to identify the true MFE while accounting for the $k_B T\log R$ symmetry penalty. The main result is the first polynomial-time MFE algorithm for $O(1)$ strands with symmetry, running in $O(N^4(c-1)!)$ time and $O(N^4)$ space (with an $O(N^4\log N(c-1)!)$ time, $O(N^3)$ space variant) for $N$ bases and $c$ strands. This matches the asymptotic run time of the partition-function algorithm and provides a path toward efficient multi-stranded MFE computation despite the underlying NP-hardness for growing strand counts. Key technical contributions include a linear bound on the number of symmetric backbone cuts, a rigorous “pizza slice” decomposition to manage rotational symmetry, and a backtracking framework that constructs an asymmetric true-MFE structure within a bounded energy window. These ideas yield a symmetry-aware, practical MFE solver for small-to-moderate numbers of interacting strands, with potential extensions to larger systems and connections to partition-function analyses.
Abstract
The information-encoding molecules RNA and DNA form a combinatorially large set of secondary structures through nucleic acid base pairing. Thermodynamic prediction algorithms predict favoured, or minimum free energy (MFE), secondary structures, and can assign an equilibrium probability to any structure via the partition function: a Boltzmann-weighted sum over the set of secondary structures. MFE is NP-hard in the presence of pseudoknots, base pairings that violate a restricted planarity condition. However, unpseudoknotted structures are amenable to dynamic programming: for a single DNA/RNA strand there are polynomial time algorithms for MFE and partition function. For multiple strands, the problem is more complicated due to entropic penalties. Dirks et al. [SICOMP Review; 2007] showed that for O(1) strands, with N bases, there is a partition function algorithm that runs in time polynomial in N; however, their technique did not generalise to MFE, which they left open. We give the first polynomial time (O(N^4)) algorithm for unpseudoknotted multiple (O(1)) strand MFE, answering the open problem from Dirks et al. The challenge lies in considering rotational symmetry of secondary structures, a feature not immediately amenable to dynamic programming algorithms. Our proof has two main technical contributions. First, we give a polynomial upper bound on the number of symmetric secondary structures to be considered when computing rotational symmetry penalties. Second, that bound is leveraged by a backtracking algorithm to find the MFE in an exponential space of contenders. Our MFE algorithm has the same asymptotic run time as Dirks et al.'s partition function algorithm, suggesting efficient handling of rotational symmetry, although with higher space complexity. It also seems reasonably tight in the number of strands since Condon, Hajiaghayi & Thachuk [DNA27, 2021] have shown that unpseudoknotted MFE is NP-hard for O(N) strands.
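
To make the rotational symmetry penalty concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a strand ordering is given as a list of strand-type labels arranged on a circle, and it computes the degree of rotational symmetry of that ordering alone. The symmetry degree $R$ of an actual secondary structure also depends on its base pairs and can be no larger than this value. The function names, the default temperature, and the constant `K_B` (the Boltzmann constant in molar units, kcal/(mol·K)) are illustrative assumptions.

```python
import math

# Assumption: Boltzmann constant in molar units, kcal/(mol*K).
K_B = 0.0019872


def ordering_symmetry_degree(strand_order):
    """Degree of rotational symmetry of a circular strand ordering.

    Returns the largest R such that rotating the circular ordering by
    len(strand_order) / R positions maps it onto itself. This is only an
    upper bound on the symmetry degree of any structure drawn on that
    ordering, since the base pairs must also be preserved by the rotation.
    """
    order = list(strand_order)
    n = len(order)
    for shift in range(1, n + 1):
        # The smallest shift that fixes the ordering gives the largest R.
        if n % shift == 0 and order[shift:] + order[:shift] == order:
            return n // shift
    return 1  # only reached for an empty ordering


def symmetry_penalty(R, temperature_kelvin=310.15):
    """Free energy correction k_B * T * log(R) for an R-fold symmetric structure."""
    return K_B * temperature_kelvin * math.log(R)


def corrected_energy(naive_energy, R, temperature_kelvin=310.15):
    """Symmetry-naive free energy plus the rotational symmetry penalty."""
    return naive_energy + symmetry_penalty(R, temperature_kelvin)
```

For example, the ordering `["A", "B", "A", "B"]` has symmetry degree 2, so a fully symmetric structure on it would incur a penalty of $k_B T \log 2$, while any structure on `["A", "B", "C"]` has $R = 1$ and no penalty. The difficulty the abstract describes is that this correction depends on the whole structure, so it cannot be folded directly into a standard dynamic programming recursion.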
