A Simple and Efficient Algorithm for Sorting Signed Permutations by Reversals

Krister M. Swenson

TL;DR

The paper addresses sorting a signed permutation into the identity with a minimum-length sequence of reversals, a foundational problem in computational genomics. It recasts the problem in terms of the overlap graph, where a reversal corresponds to a local complementation at a vertex, and adds a recovery framework for unsafe reversals. A splay tree (a self-adjusting binary search tree) is used to manage good pairs efficiently. The main contribution is an algorithm that runs in $O(n \log n)$ time in the worst case, is lightweight to implement, and relies on a recoverable sequence of good reversals supported by a specialized data structure and a backtracking recovery mechanism. This improves on the previous best running time of $O(n \log^2 n / \log\log n)$, reaches what some consider the likely lower bound, and makes large-scale genome rearrangement studies more practical.
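The splay tree mentioned above is restructured through rotations, each of which promotes a node above its parent while leaving the in-order traversal unchanged (see the caption of Figure 5). The following is a minimal Python sketch of that single operation only. The names `Node`, `link`, `rotate_up`, and `in_order` are hypothetical, and the sketch stores plain integer keys rather than whatever per-node values the paper's tree maintains.

```python
class Node:
    """A bare binary-tree node with a parent pointer (hypothetical layout)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None


def link(parent, left=None, right=None):
    """Attach children to parent and set their parent pointers."""
    parent.left, parent.right = left, right
    for child in (left, right):
        if child is not None:
            child.parent = parent
    return parent


def rotate_up(x):
    """Promote x above its parent; the in-order sequence is unchanged."""
    p = x.parent
    if p is None:
        return x  # x is already the root
    g = p.parent
    if p.left is x:
        # Right rotation: x's right subtree becomes p's left subtree.
        p.left = x.right
        if x.right is not None:
            x.right.parent = p
        x.right = p
    else:
        # Left rotation: x's left subtree becomes p's right subtree.
        p.right = x.left
        if x.left is not None:
            x.left.parent = p
        x.left = p
    p.parent = x
    x.parent = g
    if g is not None:
        # Re-hang x where p used to be under the grandparent.
        if g.left is p:
            g.left = x
        else:
            g.right = x
    return x


def in_order(node):
    """Keys of the subtree rooted at node, left to right."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


n1, n2, n3, n4, n5 = (Node(k) for k in range(1, 6))
link(n2, n1, n3)
root = link(n4, n2, n5)
before = in_order(root)
root = rotate_up(n2)
print(before, in_order(root))
```

A splay operation repeats such rotations until the accessed node reaches the root; that composition is not shown here.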

Abstract

In 1937, biologists Sturtevant and Tan posed a computational question: transform a chromosome represented by a permutation of genes, into a second permutation, using a minimum-length sequence of reversals, each inverting the order of a contiguous subset of elements. Solutions to this problem, applied to Drosophila chromosomes, were computed by hand. The first algorithmic result was a heuristic that was published in 1982. In the 1990s a more biologically relevant version of the problem, where the elements have signs that are also inverted by a reversal, finally received serious attention by the computer science community. This effort eventually resulted in the first polynomial time algorithm for Signed Sorting by Reversals. Since then, a dozen more articles have been dedicated to simplifying the theory and developing algorithms with improved running times. The current best algorithm, which runs in $O(n \log^2 n / \log\log n)$ time, fails to meet what some consider to be the likely lower bound of $O(n \log n)$. In this article, we present the first algorithm that runs in $O(n \log n)$ time in the worst case. The algorithm is fairly simple to implement, and the running time hides very low constants.
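To make the signed reversal operation concrete, here is a minimal Python sketch. It assumes a permutation stored as a list of integers and a reversal given by 0-based, inclusive positions `i` and `j`; the function name `reversal` and this index convention are assumptions for illustration, chosen so that the call below reproduces the first step described in the caption of Figure 1.

```python
def reversal(perm, i, j):
    """Reverse perm[i..j] (0-based, inclusive) and flip the sign of each element."""
    if not 0 <= i <= j < len(perm):
        raise ValueError("need 0 <= i <= j < len(perm)")
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


pi = [0, -2, 3, 1, 4]
step1 = reversal(pi, 2, 3)     # [0, -2, -1, -3, 4], as in the Figure 1 caption
step2 = reversal(step1, 1, 2)  # [0, 1, 2, -3, 4]
step3 = reversal(step2, 3, 3)  # [0, 1, 2, 3, 4]
print(step1, step2, step3)
```

The second and third reversals are one way to finish sorting this small example by hand; the sketch does not choose reversals itself, which is the part the paper's algorithm addresses.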

Paper Structure (14 sections, 11 theorems, 5 equations, 6 figures, 2 algorithms)

This paper contains 14 sections, 11 theorems, 5 equations, 6 figures, 2 algorithms.

Key Result

Theorem 1

A reversal sequence $S$ transforming a permutation into the identity is of minimum length if $S$ contains only good reversals.

Figures (6)

  • Figure 1: A reversal sequence transforming $(0~{-2}~3~1~4)$ into the identity permutation. At the top, $(0~{-2}~3~1~4)$ is transformed into $(0~{-2}~{-1}~{-3}~4)$ by the first reversal $\rho(2,3) = \mu(-2,1)$. The pairs of points for the elements of a permutation appear in a line above the permutation, and an interval for each identity pair is indicated above those points by a black line; lines are solid for good pairs and dashed for bad pairs. The overlap graph for each permutation appears below it; good vertices are colored black and bad are colored white. The sequence of reversals on permutations can be represented by the sequence of vertex complementations indicated by arrows, yielding a graph with only isolated white vertices.
  • Figure 2: The upper permutation $\pi = (0~{-5}~{-2}~{-4}~{-7}~{-9}~{-10}~{-6}~1~3~8~11)$ is transformed into $\pi' = \pi\rho(5, 10) = (0~{-5}~{-2}~{-4}~{-3}~{-1}~6~10~9~7~8~11)$ through the unsafe reversal $\rho(5, 10) = \mu(3, -4) = \mu(-7, 8)$. The overlap graph for $\pi$ is on the bottom-left, and the overlap graph for $\pi'$ is on the bottom-right. Vertices from the bad components of $OV(\pi')$ are labeled by $b$ and vertices from the good components are labeled by $g$. See the caption of Figure 1 for a more detailed description.
  • Figure 3: On the left, a graph $H$ has a single connected component, before being split into graph $H/v$ having bad components $B_1, B_2, \ldots, B_{\ell_{1}}$ and good components $G_1, G_2, \ldots, G_{\ell_{2}}$, on the right. Good vertices are filled in black. Note that, due to the definition of vertex complementation, all vertices from $B = \cup_{i=1}^{\ell_{1}} B_i$ that are adjacent to $v$ must be good, while the others from $B$ must be bad. If, instead of $v$, a good vertex $u$ from $B$ is applied to $H$, then any good vertex in $H/u[B]$ will be adjacent to all the vertices of $X$.
  • Figure 4: A splay tree a) before, and b) after the reversal $\mu(-7, 8)$ of Figure 2. Nodes are labeled by the values indicated in the bottom-left corner of the figure.
  • Figure 5: A rotation promotes a node above its parent, while keeping the in-order traversal intact.
  • ...and 1 more figure
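The correspondence between reversals and vertex complementations described in the caption of Figure 1 can be checked on that figure's permutation with a short Python sketch. Everything below rests on stated assumptions rather than on the paper's own definitions: the element at index `i` is given points `2i` and `2i + 1`, the interval for the pair `(k, k+1)` follows one common convention from the sorting-by-reversals literature, vertex `k` stands for that pair, and complementation at `v` toggles the edges among the neighbours of `v`, isolates `v`, and flips `v` and its neighbours between good and bad. The function names are hypothetical.

```python
from itertools import combinations


def reversal(perm, i, j):
    """Reverse perm[i..j] (0-based, inclusive) and negate each element in it."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def overlap_graph(perm):
    """Return (adjacency, good) for a signed permutation framed by 0 and n.

    ASSUMED convention: the interval for the pair (k, k+1) runs from the
    right point of k (left point if k is negative) to the left point of k+1
    (right point if k+1 is negative). A pair is good when the signs differ.
    """
    n = len(perm) - 1
    index = {abs(x): i for i, x in enumerate(perm)}
    positive = {abs(x): x >= 0 for x in perm}
    interval, good = {}, set()
    for k in range(n):
        p = 2 * index[k] + (1 if positive[k] else 0)
        q = 2 * index[k + 1] + (0 if positive[k + 1] else 1)
        interval[k] = (min(p, q), max(p, q))
        if positive[k] != positive[k + 1]:
            good.add(k)
    adj = {k: set() for k in interval}
    for u, v in combinations(interval, 2):
        (a, b), (c, d) = interval[u], interval[v]
        # Overlap: the intervals intersect but neither contains the other.
        if a < c < b < d or c < a < d < b:
            adj[u].add(v)
            adj[v].add(u)
    return adj, good


def complement_vertex(adj, good, v):
    """Local complementation at v (assumed rule), returning new (adjacency, good)."""
    nbrs = set(adj[v])
    new_adj = {u: set(ns) for u, ns in adj.items()}
    # Toggle every edge between two neighbours of v.
    for a, b in combinations(sorted(nbrs), 2):
        if b in new_adj[a]:
            new_adj[a].remove(b)
            new_adj[b].remove(a)
        else:
            new_adj[a].add(b)
            new_adj[b].add(a)
    # v loses all of its edges and becomes isolated.
    for u in nbrs:
        new_adj[u].discard(v)
    new_adj[v] = set()
    # v and its former neighbours switch between good and bad.
    return new_adj, good ^ (nbrs | {v})


pi = [0, -2, 3, 1, 4]
adj, good = overlap_graph(pi)
# Vertex 1 stands for the pair (1, 2), i.e. the reversal mu(-2, 1) of Figure 1.
after = complement_vertex(adj, good, 1)
print(after == overlap_graph(reversal(pi, 2, 3)))  # True
```

Under these assumptions, complementing at the vertex for the pair (1, 2) gives the same graph as rebuilding the overlap graph after the reversal. The quadratic pairwise comparison is for illustration only and is unrelated to the paper's $O(n \log n)$ data structure.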

Theorems & Definitions (24)

  • Theorem 1: Hannenhalli and Pevzner (1999)
  • Definition
  • Theorem 2: Tannier, Bergeron, and Sagot (2007)
  • Remark 1
  • Remark 2
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Corollary 1
  • ...and 14 more