
Quantum Algorithms for the Shortest Common Superstring and Text Assembling Problems

Kamil Khadiev, Carlos Manuel Bosch Machado, Zeyu Chen, Junde Wu

TL;DR

The paper addresses two problems related to assembling strings from a dictionary: the Shortest Common Superstring (SCS) and the Text Assembling Problem (TAO). It proposes quantum algorithms that outperform known classical approaches, achieving $O(m+\log m\sqrt{nL})$ running time for TAO and $O(n^{3}1.728^n+L+n^{1.5}\sqrt{L}+\sqrt{L}n\log^2L\log^2n)$ for SCS, by combining Grover-type search, quantum string matching, and dynamic programming on a Boolean cube. Key techniques include rolling-hash-based substring checks, suffix arrays, and segment trees, plus a graph-construction approach (ConstructTheGraph and ConstructTheGraph2) that maps SCS onto a maximum-weight Hamiltonian path problem. The results establish the first quantum algorithm for SCS and offer quantum speedups for text assembly problems inspired by DNA sequencing, while leaving open questions such as robustness to typos and tighter lower and upper bounds. Overall, the work advances quantum algorithm design for combinatorial string assembly, with potential impact on bioinformatics and data compression tasks.
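To make the "rolling-hash-based substring checks" mentioned above concrete, here is a minimal classical sketch in the style of Rabin-Karp. It is an illustration only, not the paper's quantum procedure: the function name `find_occurrences` and the choices of `base` and `mod` are assumptions made for this example.

```python
def find_occurrences(text: str, pattern: str,
                     base: int = 257, mod: int = (1 << 61) - 1) -> list[int]:
    """Return all start indices where `pattern` occurs in `text`.

    Hypothetical helper for illustration; `base` and `mod` are arbitrary choices.
    """
    n, k = len(text), len(pattern)
    if k == 0 or k > n:
        return []
    high = pow(base, k - 1, mod)  # weight of the leading character of a window
    target = 0
    window = 0
    for i in range(k):
        target = (target * base + ord(pattern[i])) % mod
        window = (window * base + ord(text[i])) % mod
    hits = []
    for start in range(n - k + 1):
        # Compare hashes first, then confirm directly to rule out collisions.
        if window == target and text[start:start + k] == pattern:
            hits.append(start)
        if start + k < n:
            # Slide the window: drop text[start], append text[start + k].
            window = (window - ord(text[start]) * high) % mod
            window = (window * base + ord(text[start + k])) % mod
    return hits


print(find_occurrences("ACGTACGT", "CGT"))
```

Each window hash is updated in constant time, so a single pattern is checked against the whole text with a linear number of hash updates, plus direct comparisons wherever the hashes agree.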

Abstract

In this paper, we consider two versions of the Text Assembling problem. We are given a sequence of strings $s^1,\dots,s^n$ of total length $L$ that is a dictionary, and a string $t$ of length $m$ that is a text. The first version of the problem is assembling $t$ from the dictionary. The second version is the ``Shortest Superstring Problem'' (SSP) or the ``Shortest Common Superstring Problem'' (SCS). In this case, $t$ is not given, and we should construct the shortest string (we call it a superstring) that contains each string from the given sequence as a substring. These problems are connected with the sequence assembly method for reconstructing a long DNA sequence from small fragments. For both problems, we suggest new quantum algorithms that work better than their classical counterparts. In the first case, we present a quantum algorithm with $O(m+\log m\sqrt{nL})$ running time. In the case of SSP, we present a quantum algorithm with running time $O(n^3 1.728^n +L +\sqrt{L}n^{1.5}+\sqrt{L}n\log^2L\log^2n)$.
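For intuition about the SSP/SCS definition, the following is a minimal classical sketch that reduces the problem to a maximum-weight Hamiltonian path over pairwise overlaps and solves it with dynamic programming over subsets. This is the standard exponential-time classical baseline, not the quantum algorithm of the paper; the names `overlap` and `shortest_superstring` are assumptions made for this example.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring(strings: list[str]) -> str:
    """Classical subset DP sketch (hypothetical helper, exponential in n)."""
    # Drop duplicates and strings already contained in another string.
    uniq = list(dict.fromkeys(strings))
    words = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    n = len(words)
    if n == 0:
        return ""
    # w[i][j]: overlap gained when words[j] directly follows words[i].
    w = [[overlap(words[i], words[j]) if i != j else 0 for j in range(n)]
         for i in range(n)]
    # best[mask][j]: maximum total overlap of a path visiting `mask`, ending at j.
    best = [[-1] * n for _ in range(1 << n)]
    prev = [[-1] * n for _ in range(1 << n)]
    for j in range(n):
        best[1 << j][j] = 0
    for mask in range(1 << n):
        for j in range(n):
            if best[mask][j] < 0:
                continue
            for k in range(n):
                if mask & (1 << k):
                    continue
                nxt = mask | (1 << k)
                cand = best[mask][j] + w[j][k]
                if cand > best[nxt][k]:
                    best[nxt][k] = cand
                    prev[nxt][k] = j
    full = (1 << n) - 1
    end = max(range(n), key=lambda j: best[full][j])
    # Walk predecessors back to recover the order of the words.
    order = []
    mask, j = full, end
    while j != -1:
        order.append(j)
        mask, j = mask ^ (1 << j), prev[mask][j]
    order.reverse()
    # Glue the words together, skipping the overlapping prefixes.
    result = words[order[0]]
    for a, b in zip(order, order[1:]):
        result += words[b][w[a][b]:]
    return result


print(shortest_superstring(["ABC", "BCD", "CDE"]))
```

The superstring length equals the total length of the retained words minus the weight of the best path, which is why maximizing total overlap minimizes the result.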

Paper Structure (15 sections, 10 theorems, 36 equations, 14 algorithms)


Key Result

Lemma 1

A suffix array for a string $u$ can be constructed in $O(|u|)$ running time.
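As a reminder of what a suffix array is, the sketch below builds one by prefix doubling. It is deliberately simple and runs in roughly $O(|u|\log^2|u|)$ time, so it does not achieve the linear bound stated in Lemma 1; the function name `suffix_array` is an assumption made for this example.

```python
def suffix_array(u: str) -> list[int]:
    """Start indices of the suffixes of `u` in lexicographic order.

    Prefix-doubling sketch for illustration, not a linear-time construction.
    """
    n = len(u)
    if n == 0:
        return []
    rank = [ord(c) for c in u]  # rank of each suffix by its first character
    sa = list(range(n))
    k = 1
    while True:
        def key(i: int) -> tuple[int, int]:
            # Rank of the first k characters, then of the next k characters.
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for idx in range(1, n):
            different = key(sa[idx]) != key(sa[idx - 1])
            new_rank[sa[idx]] = new_rank[sa[idx - 1]] + int(different)
        rank = new_rank
        if rank[sa[-1]] == n - 1:  # all suffixes now have distinct ranks
            break
        k *= 2
    return sa


print(suffix_array("banana"))
```

Once the array is available, occurrences of a pattern form a contiguous range of it, so they can be located by binary search over the sorted suffixes.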

Theorems & Definitions (10)

  • Lemma 1: llh2018
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Corollary 1
  • Theorem 1
  • Lemma 5
  • Corollary 2
  • Lemma 6: kiv2022
  • Theorem 2