Bounded Edit Distance: Optimal Static and Dynamic Algorithms for Small Integer Weights

Egor Gorbachev, Tomasz Kociumaka

TL;DR

This work addresses the bounded edit distance problem for strings with small integer weights, including its dynamic variant. It develops a unified framework based on the alignment-graph representation and Monge structure to reduce general instances to those with self-edit distance $\mathsf{self}\text{-}\mathsf{ed}(X)=\mathcal{O}(k)$, enabling near-optimal static and dynamic algorithms. The key technical advances are a robust divide-and-conquer scheme guided by approximate $w$-optimal alignments, a core-sparse Monge matrix multiplication toolkit with a Core-based Matrix Oracle, and efficient handling of copy-paste updates via straight-line programs, together with strategies to cope with large weights. Consequently, the authors compute bounded weighted edit distance in $\tilde{O}(n+Wk^2)$ time in the static setting and support dynamic updates in $\tilde{O}(W^2k)$ time each; the static algorithm can alternatively run in $\tilde{O}(n+k^{2.5})$ time, which is the better bound when $W$ exceeds $\sqrt{k}$. Near-optimality here is relative to standard fine-grained complexity assumptions. These results bridge static and dynamic settings and advance near-optimal bounds for weighted, bounded edit distance in a variety of regimes.

Abstract

The edit distance of two strings is the minimum number of insertions, deletions, and substitutions needed to transform one string into the other. The textbook algorithm determines the edit distance of length-$n$ strings in $O(n^2)$ time, which is optimal up to subpolynomial factors under the Orthogonal Vectors Hypothesis (OVH). In the bounded version of the problem, parameterized by the edit distance $k$, the algorithm of Landau and Vishkin [JCSS'88] achieves $O(n+k^2)$ time, which is optimal as a function of $n$ and $k$. The dynamic version of the problem asks to maintain the edit distance of two strings that change dynamically, with each update modeled as an edit. A folklore approach supports updates in $\tilde O(k^2)$ time, where $\tilde O(\cdot)$ hides polylogarithmic factors. Recently, Charalampopoulos, Kociumaka, and Mozes [CPM'20] showed an algorithm with update time $\tilde O(n)$, which is optimal under OVH in terms of $n$. The update time of $\tilde O(\min\{n,k^2\})$ raised an exciting open question of whether $\tilde O(k)$ is possible; we answer it affirmatively. Our solution relies on tools originating from weighted edit distance, where the weight of each edit depends on the edit type and the characters involved. The textbook algorithm supports weights, but the Landau-Vishkin approach does not, and a simple $O(nk)$-time procedure long remained the fastest for bounded weighted edit distance. Only recently, Das et al. [STOC'23] provided an $O(n+k^5)$-time algorithm, whereas Cassis, Kociumaka, and Wellnitz [FOCS'23] presented an $\tilde O(n+\sqrt{nk^3})$-time solution and a matching conditional lower bound. In this paper, we show that, for integer edit weights between $0$ and $W$, weighted edit distance can be computed in $\tilde O(n+Wk^2)$ time and maintained dynamically in $\tilde O(W^2k)$ time per update. Our static algorithm can also be implemented in $\tilde O(n+k^{2.5})$ time.
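
To make the baselines mentioned in the abstract concrete, here is a minimal Python sketch of the textbook quadratic-time dynamic program for weighted edit distance, together with a banded variant in the spirit of the simple $O(nk)$-time procedure. It is an illustration only, not the algorithm of this paper. The names `EPS`, `fig1_weight`, `weighted_edit_distance`, and `bounded_weighted_edit_distance` are hypothetical. The example weights are taken from the caption of Figure 1, with matches assumed to cost $0$. The banded variant additionally assumes that every insertion and deletion costs at least $1$, so that an alignment of cost at most $k$ never leaves the band $|i-j| \le k$.

```python
import math
from typing import Callable, Optional

EPS = ""  # hypothetical stand-in for the empty symbol (epsilon)
Weight = Callable[[str, str], int]


def fig1_weight(a: str, b: str) -> int:
    """Weights from the Figure 1 caption; matches are assumed to cost 0."""
    if a == b:
        return 0
    return 1 if (a, b) in {("a", EPS), (EPS, "b")} else 3


def weighted_edit_distance(X: str, Y: str, w: Weight) -> int:
    """Textbook DP over an (|X|+1) x (|Y|+1) table, kept one row at a time."""
    prev = [0] * (len(Y) + 1)
    for j, y in enumerate(Y):
        prev[j + 1] = prev[j] + w(EPS, y)            # insert the prefix of Y
    for x in X:
        cur = [prev[0] + w(x, EPS)]                  # delete x
        for j, y in enumerate(Y):
            cur.append(min(prev[j + 1] + w(x, EPS),  # delete x
                           cur[j] + w(EPS, y),       # insert y
                           prev[j] + w(x, y)))       # match or substitute
        prev = cur
    return prev[-1]


def bounded_weighted_edit_distance(X: str, Y: str, w: Weight,
                                   k: int) -> Optional[int]:
    """Return the distance if it is at most k, else None.

    Assumption of this sketch: insertions and deletions cost at least 1,
    so only cells with |i - j| <= k need to be filled in.
    """
    n, m = len(X), len(Y)
    if abs(n - m) > k:
        return None
    inf = math.inf
    prev = {0: 0}                                    # row 0, restricted to the band
    for j in range(1, min(m, k) + 1):
        prev[j] = prev[j - 1] + w(EPS, Y[j - 1])
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            best = prev.get(j, inf) + w(X[i - 1], EPS)
            if j > 0:
                best = min(best,
                           cur.get(j - 1, inf) + w(EPS, Y[j - 1]),
                           prev.get(j - 1, inf) + w(X[i - 1], Y[j - 1]))
            cur[j] = best
        prev = cur
    d = prev[m]
    return d if d <= k else None
```

With the Figure 1 weights, both functions return 3 for $X = \mathtt{baaa}$ and $Y = \mathtt{bab}$ (match $\mathtt{b}$ and $\mathtt{a}$, delete two copies of $\mathtt{a}$, insert $\mathtt{b}$), and the banded variant returns `None` for the threshold $k = 2$. The banded table has $O(nk)$ cells, which is where the baseline running time comes from.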

Paper Structure (4 sections, 6 theorems, 3 equations, 3 figures)

Key Result

Theorem 1

Fix an alphabet $\Sigma$ and a weight function $w : \overline{\Sigma}^2 \to \mathbb{Z}_{\ge 0}$. Given strings $X,Y\in \Sigma^{\le n}$, the weighted edit distance $k \coloneqq \mathsf{ed}^w(X,Y)$ can be computed in $\mathcal{O}(n+k^2 \log^2 n)$ time.

Figures (3)

  • Figure 1: The alignment graph $\mathrm{AG}^w(X, Y)$ for $X = \mathtt{baaa}$ and $Y = \mathtt{bab}$. In the unweighted case, the green edges have weight $0$, and the red ones have weight $1$. In the weighted case, the red edges have some weights defined by the weight function $w$. An optimal alignment for a weight function $w$ satisfying $w(\mathtt{a}, \varepsilon) = w(\varepsilon, \mathtt{b}) = 1$ and $w(\mathtt{a}, \mathtt{b}) = w(\mathtt{b}, \mathtt{a}) = w(\varepsilon, \mathtt{a}) = w(\mathtt{b}, \varepsilon) = 3$ is given in gray. The input vertices for the boundary matrix $\mathrm{BM}^w(X, Y)$ have orange labels, and the output vertices have violet labels.
  • Figure 2: Our algorithms' setup for the general case and the case of small self-edit distance, respectively.
  • Figure 3: The sum of the distances from $u_i$ to $v_{j + 1}$ and from $u_{i + 1}$ to $v_{j}$ is not smaller than the sum of the distances from $u_i$ to $v_j$ and from $u_{i + 1}$ to $v_{j + 1}$.
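
The inequality in the caption of Figure 3 is the Monge property of boundary-to-boundary distances in the alignment graph. The Python sketch below checks it numerically on the Figure 1 instance. It makes several simplifying assumptions that are not taken from the paper: vertices are pairs $(x, y)$ with edges for deletion, insertion, and match or substitution; matches cost $0$; and only top-row sources $u_i = (0, i)$ and bottom-row sinks $v_j = (|X|, j)$ are used, rather than the full boundary matrix $\mathrm{BM}^w(X, Y)$. The names `EPS`, `fig1_weight`, `distances_from_top`, `top_to_bottom_matrix`, and `is_monge` are hypothetical.

```python
import math
from typing import Callable, List

EPS = ""  # hypothetical stand-in for the empty symbol (epsilon)
Weight = Callable[[str, str], int]


def fig1_weight(a: str, b: str) -> int:
    """Weights from the Figure 1 caption; matches are assumed to cost 0."""
    if a == b:
        return 0
    return 1 if (a, b) in {("a", EPS), (EPS, "b")} else 3


def distances_from_top(X: str, Y: str, w: Weight, j0: int) -> List[list]:
    """Shortest-path distances from vertex (0, j0) to every vertex (x, y).

    Edges: (x-1, y) -> (x, y) deletes X[x-1]; (x, y-1) -> (x, y) inserts
    Y[y-1]; (x-1, y-1) -> (x, y) matches or substitutes. Unreachable
    vertices keep the distance math.inf.
    """
    n, m = len(X), len(Y)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][j0] = 0
    for x in range(n + 1):          # row-major order visits predecessors first
        for y in range(m + 1):
            if x > 0:
                d[x][y] = min(d[x][y], d[x - 1][y] + w(X[x - 1], EPS))
            if y > 0:
                d[x][y] = min(d[x][y], d[x][y - 1] + w(EPS, Y[y - 1]))
            if x > 0 and y > 0:
                d[x][y] = min(d[x][y], d[x - 1][y - 1] + w(X[x - 1], Y[y - 1]))
    return d


def top_to_bottom_matrix(X: str, Y: str, w: Weight) -> List[list]:
    """M[i][j] = distance from u_i = (0, i) to v_j = (|X|, j)."""
    return [distances_from_top(X, Y, w, i)[len(X)] for i in range(len(Y) + 1)]


def is_monge(M: List[list]) -> bool:
    """Check M[i][j] + M[i+1][j+1] <= M[i][j+1] + M[i+1][j] for adjacent cells."""
    return all(M[i][j] + M[i + 1][j + 1] <= M[i][j + 1] + M[i + 1][j]
               for i in range(len(M) - 1)
               for j in range(len(M[0]) - 1))


M = top_to_bottom_matrix("baaa", "bab", fig1_weight)
print(is_monge(M))
```

Entries for unreachable pairs (a sink to the left of its source) are `math.inf`, and the comparison treats `inf <= inf` as true, so those cells satisfy the inequality trivially. The entry `M[0][3]` is the distance from $(0,0)$ to $(|X|,|Y|)$, that is, the weighted edit distance of the two strings, which equals 3 for this instance.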

Theorems & Definitions (14)

  • Theorem 1
  • Remark 2
  • Theorem 3
  • Theorem 4
  • Remark 5
  • Theorem 6
  • Theorem 7
  • Definition 8: Alignment, DGHKS23
  • Definition 9: Alignment Graph, CKW23
  • Definition 13
  • ...and 4 more