Subsequences With Generalised Gap Constraints: Upper and Lower Complexity Bounds

Florin Manea, Jonas Richardsen, Markus L. Schmid

TL;DR

This work introduces and analyses subsequences with generalised gap constraints, where each gap between pattern positions must belong to a specified language. It establishes NP-hardness for both regular and semilinear gap constraints and provides a detailed parameterised complexity landscape, including W[1]-hardness for the key parameter |p| (the length of the pattern p) and tractability results when the constraint structure is restricted. Two structural representations of constraint sets, the interval structure and a graph-based view, enable refined results: polynomial-time solvability when the vertex-separation number is bounded, and efficient algorithms for non-intersecting (outerplanar) gap constraints using fast matrix multiplication. The paper also proves conditional lower bounds under SETH via reductions from 3-OV, which mark the boundary between tractable and intractable cases and identify subclasses that admit efficient algorithms.

Abstract

For two strings u, v over some alphabet A, we investigate the problem of embedding u into v as a subsequence under the presence of generalised gap constraints. A generalised gap constraint is a triple (i, j, C_{i, j}), where 1 <= i < j <= |u| and C_{i, j} is a subset of A^*. Embedding u as a subsequence into v such that (i, j, C_{i, j}) is satisfied means that if u[i] and u[j] are mapped to v[k] and v[l], respectively, then the induced gap v[k + 1..l - 1] must be a string from C_{i, j}. This generalises the setting recently investigated in [Day et al., ISAAC 2022], where only gap constraints of the form C_{i, i + 1} are considered, as well as the setting from [Kosche et al., RP 2022], where only gap constraints of the form C_{1, |u|} are considered. We show that subsequence matching under generalised gap constraints is NP-hard, and we complement this general lower bound with a thorough (parameterised) complexity analysis. Moreover, we identify several efficiently solvable subclasses that result from restricting the interval structure induced by the generalised gap constraints.
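
To make the definition concrete, the following is a minimal brute-force sketch of the matching problem, not an algorithm from the paper. It assumes each language C_{i, j} is supplied as a membership predicate on strings. The names `find_embedding` and `Constraint` are hypothetical and chosen only for illustration. The search tries every increasing tuple of positions in v, so it is exponential in |u| and is only meant to illustrate the semantics of a constraint.

```python
from itertools import combinations
from typing import Callable, Iterable, Optional, Tuple

# A constraint (i, j, C) uses 1-based pattern positions with i < j.
# Assumption of this sketch: C is given as a membership test on the gap string.
Constraint = Tuple[int, int, Callable[[str], bool]]


def find_embedding(
    u: str, v: str, constraints: Iterable[Constraint]
) -> Optional[Tuple[int, ...]]:
    """Return 1-based positions in v of an embedding of u that satisfies
    every constraint, or None if no such embedding exists."""
    constraints = list(constraints)
    # Each strictly increasing tuple of |u| positions is a candidate embedding.
    for positions in combinations(range(1, len(v) + 1), len(u)):
        # The t-th letter of u must equal the letter of v it is mapped to.
        if any(v[k - 1] != u[t] for t, k in enumerate(positions)):
            continue
        # If u[i] -> v[k] and u[j] -> v[l], the gap is v[k+1..l-1] (1-based),
        # which is the Python slice v[k:l-1].
        if all(
            in_c(v[positions[i - 1]:positions[j - 1] - 1])
            for i, j, in_c in constraints
        ):
            return positions
    return None
```

For instance, with u = "ab" and v = "aabb", the single constraint `(1, 2, lambda g: g == "b")` rules out the embeddings (1, 3) and (1, 4), whose gaps are "a" and "ab", and the call returns (2, 4).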

Paper Structure

This paper contains 23 sections, 19 theorems, 48 equations, 8 figures, 7 algorithms.

Key Result

Lemma 2.2

$k$-$\textsf{OV}$ cannot be solved in $n^{k-\varepsilon} \operatorname{poly}(d)$ time for any $\varepsilon > 0$, unless SETH fails.
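
For orientation, here is a sketch of the naive algorithm that this lower bound says cannot be substantially improved. The excerpt does not spell out the problem, so the sketch assumes the standard formulation of k-OV: given k sets of n Boolean vectors of dimension d, decide whether one vector can be picked from each set so that every coordinate is 0 in at least one of the picked vectors. The function name `k_ov` is hypothetical. Enumerating all choices takes roughly n^k · k · d steps.

```python
from itertools import product
from typing import Optional, Sequence, Tuple

Vector = Tuple[int, ...]


def k_ov(sets: Sequence[Sequence[Vector]]) -> Optional[Tuple[Vector, ...]]:
    """Return one vector per set such that the chosen vectors are orthogonal
    (every coordinate is 0 in at least one of them), or None if none exist."""
    # Try all n^k ways of picking one vector from each of the k sets.
    for choice in product(*sets):
        # zip(*choice) iterates over coordinates; each column holds the
        # values of the k chosen vectors at that coordinate.
        if all(0 in column for column in zip(*choice)):
            return choice
    return None
```

With k = 2, the sets [(1, 0), (1, 1)] and [(1, 1), (0, 1)] contain the orthogonal pair (1, 0), (0, 1), which is what the call returns.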

Figures (8)

  • Figure 1: Relations between constraints
  • Figure 2: Hasse diagrams for inclusion and interval order on $\mathcal{I}_3$
  • Figure 3: Embedding $p[s..t]$ into $w[i..j]$
  • Figure 4: Joining partial embeddings
  • Figure 5: Hasse diagram of constraints
  • ...and 3 more figures

Theorems & Definitions (35)

  • Lemma 2.2
  • Lemma 3.1
  • Lemma 3.2
  • Remark 3.3
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Theorem 5.1
  • Theorem 5.2
  • Theorem 5.3
  • ...and 25 more