Complexity and algorithms for Swap median and relation to other consensus problems

Luís Cunha, Thiago Lopes, Arnaud Mary

TL;DR

This work resolves the long-standing open problem of the computational complexity of the Swap Median problem for three input permutations by proving NP-completeness, and it establishes NP-completeness for the Swap Closest problem under the same input restriction. The authors introduce a graph-theoretic framework based on algebraic cycles and 2-circles-intersection graphs, and reduce Maximum Independent Set on these graphs to the median/closest questions, with reductions that leverage 2-subdivision graphs. They further show APX-hardness, which implies that no PTAS exists for these problems unless P=NP. Collectively, the results close the complexity gap between the polynomially solvable two-input case and the NP-hard multi-input setting, and they hint at connections to Rank Aggregation with three inputs and to broader convexity notions in permutation spaces.

Abstract

Genome rearrangements are events in which large blocks of DNA exchange pieces during evolution. The analysis of such events is a tool for understanding evolutionary genomics, based on finding the minimum number of rearrangements needed to transform one genome into another, where genomes can be modeled as permutations of integers. In a general scenario, more than two genomes are considered, and new challenges arise. Given three input permutations, the Median problem consists of finding a permutation s that minimizes the sum of the distances between s and each of the three input permutations, according to a specified distance measure. We prove that the Median problem over swap distances is NP-complete, a problem whose computational complexity has remained unsolved for nearly 20 years (Eriksen, Theor. Comput. Sci., 2007). To tackle this problem, we introduce a graph-based perspective through a class called 2-circles-intersection graphs. We show that with each 2-circles-intersection graph G we can associate three permutations such that G has a large independent set iff the median of the three associated permutations reaches a specific lower bound. We then prove that Maximum Independent Set is NP-complete in this graph class. By this approach, we also establish that the Closest problem, which aims to minimize the maximum distance between the solution and the input permutations, is NP-complete even with three input permutations. This last result closes the complexity gap in the dichotomy between P and NP-complete cases: with two input permutations, the problem is easily solvable, while for an arbitrary number of input permutations, the Closest problem has been known to be NP-hard since 2007 (Popov, Theor. Comput. Sci., 2007). Additionally, we show that both the Swap Median and Swap Closest problems are APX-hard, further emphasizing the computational complexity of these genome-related problems through graph theory.
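To make the two objectives in the abstract concrete, here is a minimal brute-force sketch. It assumes that the swap distance is the minimum number of exchanges of two (not necessarily adjacent) elements, computed by the classical formula "n minus the number of cycles" of the relative permutation. The function names `swap_distance`, `median_score`, `closest_score`, and `brute_force` are illustrative and do not come from the paper. The exhaustive search is exponential in n and is only meant for tiny instances.

```python
from itertools import permutations


def swap_distance(p, q):
    """Minimum number of swaps turning p into q (one-line notation, values 1..n).

    Assumption: a swap exchanges any two elements, so the distance is
    n minus the number of cycles of the relative permutation q^{-1} o p.
    """
    n = len(p)
    pos_in_q = {value: i for i, value in enumerate(q)}
    relative = [pos_in_q[value] for value in p]  # 0-based relative permutation
    seen = [False] * n
    cycle_count = 0
    for start in range(n):
        if not seen[start]:
            cycle_count += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = relative[x]
    return n - cycle_count


def median_score(s, inputs):
    """Median objective: sum of distances from s to every input."""
    return sum(swap_distance(s, p) for p in inputs)


def closest_score(s, inputs):
    """Closest objective: maximum distance from s to any input."""
    return max(swap_distance(s, p) for p in inputs)


def brute_force(inputs, score):
    """Exhaustive search over all n! candidates; feasible for tiny n only."""
    n = len(inputs[0])
    return min(permutations(range(1, n + 1)), key=lambda s: score(s, inputs))


if __name__ == "__main__":
    inputs = [(1, 2, 3), (2, 1, 3), (1, 3, 2)]
    best_median = brute_force(inputs, median_score)
    best_closest = brute_force(inputs, closest_score)
    print(best_median, median_score(best_median, inputs))
    print(best_closest, closest_score(best_closest, inputs))
```

Both objectives reuse the same distance routine, so the hardness results can be read as saying that, with three inputs, no polynomial-time replacement for the `brute_force` step is expected unless P=NP.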

Paper Structure (7 sections, 14 theorems, 1 equation, 3 figures)

Key Result

Proposition

$(\star)$ There exist instances of the swap median problem where the number of distinct permutations belonging to all shortest paths grows exponentially with the size of the input permutations.

Figures (3)

  • Figure 1: On the left: A $2$-circle decomposition, with one circle in red and another in blue. Each arc represents a possible swap that can be applied to break a cycle, where $c(\pi_1,\pi_2) = (1\ 4\ 6\ 3)(2\ 5)(7\ 8)$ and $c(\pi_1,\pi_3) = (1\ 6\ 2\ 5)(7\ 4\ 8\ 3)$. On the right: The $2$-circles-intersection graph, where each arc (representing a possible swap) corresponds to a vertex, and two crossing arcs are adjacent in the graph.
  • Figure 2: Algebraic cycles graph $G(\pi)$ of the permutation $\pi = [8 \ 5\ 1\ 3\ 2\ 7\ 6\ 4]$, whose cycles are $(1 \ 8 \ 4 \ 3)(2 \ 5)(6 \ 7)$.
  • Figure 3: On the left: A $2$-subdivision of a graph, with red and blue dashed ellipses corresponding to the two circular models from which we obtain the two algebraic cycles graphs. On the right: Parts of the two circular models. At the top are the red circles $C_v^1$ and $C_u^1$ of $C^1$, and at the bottom are the blue circles $C_v^2$ and $C_{v,u}^2$ of $C^2$.
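The cycle notation used in the captions can be checked mechanically. The following sketch assumes the usual convention that a permutation in one-line notation maps position $i$ to its $i$-th entry; the helper name `cycles` is illustrative. It recovers the cycles stated for Figure 2.

```python
def cycles(perm):
    """Cycle decomposition of a permutation in one-line notation (values 1..n).

    Assumption: perm maps i to perm[i - 1]; each cycle starts at its
    smallest element and fixed points appear as 1-cycles.
    """
    n = len(perm)
    seen = [False] * (n + 1)
    result = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        cycle = []
        x = start
        while not seen[x]:
            seen[x] = True
            cycle.append(x)
            x = perm[x - 1]
        result.append(tuple(cycle))
    return result


if __name__ == "__main__":
    pi = [8, 5, 1, 3, 2, 7, 6, 4]
    print(cycles(pi))  # [(1, 8, 4, 3), (2, 5), (6, 7)]
```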

Theorems & Definitions (24)

  • Proposition
  • Definition
  • Theorem
  • Proof
  • Theorem
  • Theorem
  • Corollary
  • Corollary
  • Definition
  • Lemma
  • ...and 14 more