Scoped MSO, Register Automata, and Expressions: Equivalence over Data Words
Radosław Piórkowski
TL;DR
This work develops a unified descriptive theory for data-word languages by introducing Data-Regular Expressions and Scoped MSO as minimalist yet expressive formalisms that match the capabilities of nondeterministic register automata with guessing ($\mathrm{NRA}^{\mathrm{g}}$). It proves expressive equivalences: $\mathcal{L}(\mathrm{DRE})=\mathcal{L}(\mathrm{NRA}^{\mathrm{g}})$ and $\mathcal{L}(\mathrm{MSO}_{\mathrm{S}}[<,\Sigma,\bA])=\mathcal{L}(\mathrm{NRA}^{\mathrm{g}})$ under atoms that admit elimination of strong guessing, with extensions to infinite words and arbitrary relational structures via contraction-based encodings. The new formalisms provide a robust bridge among automata, logic, and expressions for data words, enabling decidability results and new avenues for addressing open questions such as separability and expressive boundaries. The framework supports systematic translations in both directions between these descriptions and lays groundwork for tackling broader data-language hierarchies.
Abstract
This paper establishes logical and expression-based characterizations for the class of languages recognized by nondeterministic register automata with guessing (NRA) over infinite alphabets. We introduce Scoped MSO, a logic featuring a novel segment modality and syntactic restrictions on data comparisons. We prove this logic is expressively equivalent to NRA over data domains where ``strong guessing'' can be eliminated. Furthermore, we define Data-Regular Expressions, a minimalist regular-expression calculus built from quantifier-free regions and equipped with $k$-contracting concatenation, and demonstrate its equivalence to NRA over arbitrary relational structures. Together, these formalisms provide a robust descriptive theory for register automata, bridging the gap between automata, logic, and expressions.
