Dynamic direct access of MSO query evaluation over strings

Pierre Bourhis, Florent Capelli, Stefan Mengel, Cristian Riveros

TL;DR

This work presents the first efficient direct access algorithm for MSO query evaluation over strings. It returns the answers in lexicographic order and, in contrast to the setting of conjunctive queries, lets the user freely choose the order between variables without degrading the runtime.

Abstract

We study the problem of evaluating a Monadic Second Order (MSO) query over strings under updates in the setting of direct access. We present an algorithm that, given an MSO query with first-order free variables represented by an unambiguous variable-set automaton $\mathcal{A}$ with state set $Q$ and variables $X$ and a string $s$, computes a data structure in time $\mathcal{O}(|Q|^\omega \cdot |X|^2 \cdot |s|)$ and, then, given an index $i$ retrieves, using the data structure, the $i$-th output of the evaluation of $\mathcal{A}$ over $s$ in time $\mathcal{O}(|Q|^\omega \cdot |X|^3 \cdot \log(|s|)^2)$ where $\omega$ is the exponent for matrix multiplication. Ours is the first efficient direct access algorithm for MSO query evaluation over strings; such algorithms so far had only been studied for first-order queries and conjunctive queries over relational data. Our algorithm gives the answers in lexicographic order where, in contrast to the setting of conjunctive queries, the order between variables can be freely chosen by the user without degrading the runtime. Moreover, our data structure can be updated efficiently after changes to the input string, allowing more powerful updates than in the enumeration literature, e.g., efficient deletion of substrings, concatenation and splitting of strings, and cut-and-paste operations. Our approach combines a matrix representation of MSO queries and a novel data structure for dynamic word problems over semi-groups, which yields an overall algorithm that is elegant and easy to formulate.
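To make the notion of direct access in lexicographic order concrete, here is a minimal Python sketch. It is not the paper's algorithm: it recounts runs from scratch for every candidate position, so each access costs time polynomial in $|s|$ rather than polylogarithmic, and it supports no updates. It assumes a transition format `(source, letter, variables, target)`, where `variables` is the set of variables placed at the position being read; the names `count_outputs`, `direct_access`, and the example automaton `DELTA` are hypothetical. The sketch relies on the automaton being unambiguous and functional, so that the number of accepting runs equals the number of outputs.

```python
def count_outputs(n_states, q0, finals, delta, s, fixed):
    """Count accepting runs over s that agree with the partial assignment
    `fixed` (variable -> 0-based position). Unfixed variables are free."""
    vec = [0] * n_states          # vec[q] = number of partial runs ending in q
    vec[q0] = 1
    for pos, letter in enumerate(s):
        required = {v for v, p in fixed.items() if p == pos}
        nxt = [0] * n_states
        for (p, a, S, q) in delta:
            if a != letter or vec[p] == 0:
                continue
            # Among the fixed variables, exactly the required ones may be placed here.
            placed = {v for v in S if v in fixed}
            if placed == required:
                nxt[q] += vec[p]
        vec = nxt
    return sum(vec[q] for q in finals)


def direct_access(n_states, q0, finals, delta, s, order, i):
    """Return the i-th output (0-based) in lexicographic order, where the
    variables are compared in the user-chosen sequence `order`."""
    fixed = {}
    for v in order:
        for pos in range(len(s)):
            fixed[v] = pos
            c = count_outputs(n_states, q0, finals, delta, s, fixed)
            if i < c:
                break             # the i-th output places v at pos
            i -= c                # skip all outputs that place v at pos
        else:
            raise IndexError("index exceeds the number of outputs")
    return dict(fixed)


# Hypothetical example automaton: x marks an 'a', y marks a later 'b'.
E = frozenset()
DELTA = [
    (0, "a", E, 0), (0, "b", E, 0), (0, "a", frozenset({"x"}), 1),
    (1, "a", E, 1), (1, "b", E, 1), (1, "b", frozenset({"y"}), 2),
    (2, "a", E, 2), (2, "b", E, 2),
]

total = count_outputs(3, 0, {2}, DELTA, "aabb", {})
by_x_then_y = direct_access(3, 0, {2}, DELTA, "aabb", ["x", "y"], 1)
by_y_then_x = direct_access(3, 0, {2}, DELTA, "aabb", ["y", "x"], 1)
```

On the string `aabb` the example automaton has four outputs. Changing `order` from `["x", "y"]` to `["y", "x"]` changes which output sits at index 1, which illustrates the user-chosen variable order mentioned in the abstract.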

Paper Structure

This paper contains 19 sections, 14 theorems, 9 equations, 2 figures, 1 algorithm.

Key Result

Lemma 1

Let $\mathcal{A} = (Q, \Sigma, X, \Delta, q_0, F)$ be a functional vset automaton. For every $q \in Q$ there exists a set $X_q \subseteq X$ such that for every partial run $\rho$ from $q_0$ to $q$ it holds that $\mathsf{vars}(\rho) = X_q$.
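The sets $X_q$ from Lemma 1 can be computed by a simple traversal from $q_0$. The Python sketch below is an illustration under assumptions, not code from the paper: it assumes transitions of the form `(source, letter, variables, target)` and an automaton in which every state is reachable from $q_0$ and can reach a final state. The function name `variable_sets` and the example transitions are hypothetical. If two partial runs reach the same state with different variable sets, or a variable is placed twice, the function raises an error, which under those assumptions signals that the automaton is not functional.

```python
from collections import deque


def variable_sets(q0, delta):
    """Map each state q reachable from q0 to the set X_q of variables
    placed on every partial run from q0 to q."""
    outgoing = {}
    for (p, _letter, S, q) in delta:
        outgoing.setdefault(p, []).append((frozenset(S), q))

    x = {q0: frozenset()}         # the empty run places no variable
    queue = deque([q0])
    while queue:
        p = queue.popleft()
        for S, q in outgoing.get(p, []):
            if S & x[p]:
                raise ValueError(f"a variable is placed twice on a run into {q!r}")
            candidate = x[p] | S
            if q not in x:
                x[q] = candidate
                queue.append(q)
            elif x[q] != candidate:
                raise ValueError(f"runs reach {q!r} with different variable sets")
    return x


# Hypothetical functional automaton: x is placed on an 'a', then y on a 'b'.
EXAMPLE = [
    (0, "a", set(), 0), (0, "a", {"x"}, 1),
    (1, "b", set(), 1), (1, "b", {"y"}, 2),
    (2, "a", set(), 2),
]
x_sets = variable_sets(0, EXAMPLE)
```

Every transition out of a reachable state is checked once, so the traversal takes time linear in the number of transitions, up to the cost of the set operations.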

Figures (2)

  • Figure 1: A running example of a vset automaton $\mathcal{A}_0$ that will be used throughout this work.
  • Figure 2: AVL-product representation for the string $abbbab$ and automaton $\mathcal{A}_0$ from Figure 1.

Theorems & Definitions (21)

  • Lemma 1
  • Example 2
  • Theorem 3
  • Proposition 4
  • Example 5
  • Lemma 6: Lemma 7 in CapelliI24
  • Example 7
  • Lemma 8
  • Theorem 10: AdelsonVL62
  • Example 11
  • ...and 11 more