Quantum Algorithm for the Multiple String Matching Problem

Kamil Khadiev, Danil Serov

TL;DR

The proposed quantum algorithm can be viewed as a quantum analogue of the Aho-Corasick algorithm, and its complexity matches the quantum lower bound $\Omega(n + \sqrt{mL})$ up to a log factor.

Abstract

We consider the Multiple String Matching Problem. The input is a long string $t$ of length $n$, referred to as the text, and a sequence $S$ of $m$ strings, referred to as the dictionary. The total length of all strings in the dictionary is denoted by $L$. The objective is to find all occurrences of dictionary strings in the text. The standard classical solution is the Aho-Corasick algorithm, which has $O(n+L)$ query and time complexity. The classical lower bound for the problem is the same, $\Omega(n+L)$. We propose a quantum algorithm with $O(n+\sqrt{mL\log n}+m\log n)$ query complexity and $O(n+\sqrt{mL\log n}\log b+m\log n)=O^*(n+\sqrt{mL})$ time complexity, where $b$ is the maximal length of the strings in the dictionary. The improvement is particularly significant for dictionaries of long words. The algorithm's complexity matches the quantum lower bound $\Omega(n + \sqrt{mL})$ up to a log factor. In some sense, the algorithm can be viewed as a quantum analogue of the Aho-Corasick algorithm.
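To make the problem statement concrete, the following is a minimal sketch of the classical Aho-Corasick baseline mentioned in the abstract. It is not the proposed quantum algorithm. The function names `build_automaton` and `find_all` are illustrative choices, and the sketch assumes the dictionary contains no empty strings.

```python
from collections import deque


def build_automaton(dictionary):
    """Build a trie with failure links over the dictionary strings."""
    goto = [{}]   # goto[v]: character -> child node; node 0 is the root
    fail = [0]    # fail[v]: node of the longest proper suffix of v in the trie
    out = [[]]    # out[v]: indices of dictionary strings ending at node v
    for idx, word in enumerate(dictionary):
        v = 0
        for ch in word:
            if ch not in goto[v]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[v][ch] = len(goto) - 1
            v = goto[v][ch]
        out[v].append(idx)

    # Breadth-first traversal: failure links of shallower nodes are ready
    # before they are needed by deeper nodes.
    queue = deque(goto[0].values())
    while queue:
        v = queue.popleft()
        for ch, u in goto[v].items():
            f = fail[v]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[u] = goto[f].get(ch, 0)
            # Inherit the matches that end at the failure target.
            out[u] = out[u] + out[fail[u]]
            queue.append(u)
    return goto, fail, out


def find_all(text, dictionary):
    """Return (start_position, word) for every occurrence in the text."""
    goto, fail, out = build_automaton(dictionary)
    matches = []
    v = 0
    for i, ch in enumerate(text):
        while v and ch not in goto[v]:
            v = fail[v]
        v = goto[v].get(ch, 0)
        for idx in out[v]:
            word = dictionary[idx]
            matches.append((i - len(word) + 1, word))
    return matches
```

For example, `find_all("ushers", ["he", "she", "his", "hers"])` reports "she" at position 1 and both "he" and "hers" at position 2. The scan reads each text character once, and the trie is built by reading every dictionary character, which is where the classical $n + L$ cost comes from.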

Paper Structure

This paper contains 7 sections, 8 theorems, 4 equations, 2 algorithms.

Key Result

lemma 1

A suffix array for a string $u$ can be constructed with $O(|u|)$ query and time complexity.
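As a reminder of what the data structure in Lemma 1 is, here is a naive sketch that builds a suffix array by sorting suffixes directly. It is for illustration only: it does not achieve the $O(|u|)$ bound stated in the lemma, and the function name `suffix_array` is an illustrative choice.

```python
def suffix_array(u):
    """Return the start positions of the suffixes of u in lexicographic order.

    Naive construction by direct comparison of suffixes; far slower than
    the linear-time construction referred to in Lemma 1.
    """
    return sorted(range(len(u)), key=lambda i: u[i:])
```

For `u = "banana"` the suffixes in sorted order are "a", "ana", "anana", "banana", "na", "nana", so the suffix array is `[5, 3, 1, 0, 4, 2]`.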

Theorems & Definitions (11)

  • lemma 1: llh2018
  • lemma 2: klaap2001
  • lemma 3: bfc2000
  • lemma 4: kkmsy2022
  • corollary 1
  • lemma 5
  • proof
  • theorem 1
  • proof
  • theorem 2
  • ...and 1 more