
Regular Languages in the Sliding Window Model

Moses Ganardi, Danny Hucke, Markus Lohrey, Konstantinos Mamouras, Tatiana Starikovskaya

TL;DR

This work analyzes the space complexity of deciding whether the current window of a stream belongs to a fixed regular language $L$. It establishes a deterministic space trichotomy (constant, logarithmic, or linear space) and characterizes each case by a structural language class. In the randomized setting it obtains a tetrachotomy, with doubly logarithmic space as the additional case, again with precise class-based characterizations. The authors introduce the path-summary technique and right-deterministic automata to obtain sublinear space in many cases, and they develop deterministic and randomized sliding window testers that solve approximate membership testing with tight space bounds. They also connect these results to property testing, offering constant-space testers for many regular languages and proving lower bounds via standard communication-complexity reductions. Overall, the paper maps the landscape of sliding-window language recognition in both deterministic and randomized models, including testers, and identifies where sublinear space is and is not achievable in streaming contexts.

Abstract

We study the space complexity of the following problem: For a fixed regular language $L$, we receive a stream of symbols and want to test membership of a sliding window of size $n$ in $L$. For deterministic streaming algorithms we prove a trichotomy theorem, namely that the (optimal) space complexity is either constant, logarithmic or linear, measured in the window size $n$. Additionally, we provide natural language-theoretic characterizations of the space classes. We then extend the results to randomized streaming algorithms and we show that in this setting, the space complexity of any regular language is either constant, doubly logarithmic, logarithmic or linear. Finally, we introduce sliding window testers, which can distinguish whether a sliding window of size $n$ belongs to the language $L$ or has Hamming distance $> \epsilon n$ to $L$. We prove that every regular language has a deterministic (resp., randomized) sliding window tester that requires only logarithmic (resp., constant) space.
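To make the problem statement concrete, the following is a minimal sketch of the naive baseline: keep the entire window and re-run a DFA on it after every symbol, which costs space linear in $n$. The class name `NaiveWindowMatcher`, the padding of the initial window with a fixed symbol, and the example DFA (words over {a, b} with an even number of `a`s) are illustrative assumptions, not taken from the paper.

```python
from collections import deque


class NaiveWindowMatcher:
    """Baseline sketch: store the whole window and re-run a DFA on it."""

    def __init__(self, n, start, accepting, delta, pad):
        # Assumption: the initial window consists of n copies of `pad`.
        self.window = deque([pad] * n, maxlen=n)
        self.start = start
        self.accepting = accepting
        self.delta = delta  # dict mapping (state, symbol) -> state

    def push(self, symbol):
        # A deque with maxlen drops the oldest symbol automatically.
        self.window.append(symbol)
        state = self.start
        for s in self.window:
            state = self.delta[(state, s)]
        return state in self.accepting


# Hypothetical example DFA: accepts words with an even number of 'a's.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
matcher = NaiveWindowMatcher(n=3, start=0, accepting={0}, delta=delta, pad="b")
results = [matcher.push(c) for c in "aabab"]
print(results)  # [False, True, True, True, False]
```

The stored window is what makes this linear in $n$; the results summarized here concern when that cost can be avoided.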

Paper Structure (48 sections, 73 theorems, 37 equations, 5 figures, 1 algorithm)


Key Result

Theorem 1.3

Let $L \subseteq \Sigma^*$ be regular.

  • The space complexity $\mathsf{F}_L(n)$ is either $\Theta(1)$, $\Theta^{\infty}(\log n)$, or $\Theta^{\infty}(n)$. Moreover, we have: …
  • The space complexity $\mathsf{V}_L(n)$ is either $\Theta(1)$, $\Theta(\log n)$, or $\Theta(n)$. Moreover, we have: …
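As an illustration of how logarithmic space can suffice, here is a minimal sketch for one hypothetical example language: windows over some alphabet that contain at least one occurrence of a chosen symbol (a left ideal, since prepending symbols never removes that occurrence). The class name `ContainsSymbol` and the capped-counter encoding are assumptions made for illustration, not the paper's construction.

```python
class ContainsSymbol:
    """Sketch: does the window of the last n symbols contain `target`?"""

    def __init__(self, n, target):
        self.n = n
        self.target = target
        # Steps since the last `target`, capped at n.
        # The value n means "no occurrence inside the current window".
        self.age = n

    def push(self, symbol):
        if symbol == self.target:
            self.age = 0
        else:
            self.age = min(self.age + 1, self.n)
        return self.age < self.n


checker = ContainsSymbol(n=3, target="a")
answers = [checker.push(c) for c in "abbbab"]
print(answers)  # [True, True, True, False, True, True]
```

The only state is a counter with values in $\{0, \dots, n\}$, so about $\log_2(n+1)$ bits are enough for this particular language.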

Figures (5)

  • Figure 4: The space complexity of regular languages in the fixed-size sliding window model. $\mathbf{Reg}$: regular languages, $\mathbf{LI}$: regular left ideals, $\mathbf{ST}$: suffix testable languages, $\mathbf{SF}$: regular suffix-free languages, $\mathbf{Len}$: regular length languages. The angle brackets $\langle \cdot \rangle$ denote Boolean closure.
  • Figure 5: A well-behaved rDFA consisting of three SCCs.
  • Figure 6: The space complexity of regular languages with respect to deterministic, true-biased and false-biased sliding window testers. As in the big-picture figure, only upper bounds are shown, and they hold for every Hamming gap function $\gamma(n)$ provided that $\gamma(n) \geqslant c$ for a constant $c$ that depends on the language. All upper bounds can be matched with lower bounds that hold for every $\gamma(n) \leqslant \epsilon n$ for a constant $\epsilon$ that depends on the language.
  • Figure 7: The run $\pi = \pi' \rho_2$ $t$-simulates the run $\rho = \rho_1 \rho_2$. We have $|\rho_1| \leqslant t$.
  • Figure 8: A compact summary of a run $\pi$.

Theorems & Definitions (78)

  • Example 1.1
  • Example 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Theorem 1.5
  • Lemma 2.1
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.4
  • Lemma 2.5
  • ...and 68 more