Constrained Multi-Tildes: Derived Term and Position Automata

Samira Attou; Ludovic Mignot; Clément Miklarz; Florent Nicart

TL;DR

This work extends multi-tildes from disjunctive combinations to arbitrary Boolean combinations, which allows richer and potentially exponentially smaller regular-expression specifications while preserving regularity. It defines the semantics through Boolean formulae over tilde positions and extends partial derivatives and Glushkov-style position automata to constrained tildes, using a surlinearization technique to ensure correctness. The paper proves that regularity is preserved, gives two finite-automaton constructions for constrained tildes (the derived-term automaton and the Glushkov automaton), and provides a Haskell implementation with visualization tooling and a web interface. These results lay the groundwork for converting automata back into expressions, for more compact representations of regular languages, and for efficient membership testing on expressions with constrained tildes.

Abstract

Multi-tildes are regular operators that were introduced to enhance the factorization power of regular expressions, allowing us to add the empty word in several factors of a catenation product of languages. Together with multi-bars, which dually remove the empty word, they allow representing any acyclic automaton by a linear-sized expression, whereas the lower bound is exponential in the classic case. In this paper, we extend multi-tildes from disjunctive combinations to any Boolean combination, allowing us to exponentially enhance the factorization power of tilde expressions. Moreover, we show how to convert these expressions into finite automata and give a Haskell implementation of them using advanced techniques of functional programming.

Paper Structure (12 sections, 27 theorems, 51 equations, 5 figures)

Introduction
Preliminaries
Boolean Formulae and Satisfiability
Constrained Multi-Tildes
Factorization Power
Partial Derivatives and Automaton Computation
The Glushkov Automaton of an Expression
The Computation for (Classical) Regular Expressions
Construction for Constrained Tildes
Correction of the Construction
Haskell Implementation
Conclusion and Perspectives

Key Result

Theorem 4

Let $\phi$ be a Boolean formula over the alphabet $\{1, \ldots, n\}$ and $L_1, \ldots, L_n$ be $n$ regular languages. Then $\phi(L_1, \ldots, L_n)$ is regular.
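
To make the statement concrete, here is a minimal sketch for finite languages, written in Python rather than the paper's Haskell. The definition of $\phi(L_1, \ldots, L_n)$ is not reproduced on this page, so the sketch assumes the reading suggested by the abstract: an interpretation assigns a truth value to each position $k$, a position set to true has its factor $L_k$ replaced by $\{\varepsilon\}$, and $\phi(L_1, \ldots, L_n)$ is the union, over the interpretations satisfying $\phi$, of the resulting catenation products. Under that assumption the result is a finite union of catenations of regular languages, which is consistent with the theorem. The names `catenate` and `constrained_tildes`, and the encoding of $\phi$ as a Python predicate, are illustrative and not the authors' API.

```python
from itertools import product

def catenate(languages):
    """Catenation product of finite languages given as sets of strings."""
    words = {""}
    for lang in languages:
        words = {u + v for u in words for v in lang}
    return words

def constrained_tildes(phi, languages):
    """Assumed semantics: union over the interpretations that satisfy phi.

    phi takes a tuple of booleans, one per position (index 0 is position 1).
    A position set to True has its language replaced by {""}, the empty word.
    """
    result = set()
    for interp in product([False, True], repeat=len(languages)):
        if phi(interp):
            factors = [{""} if erased else lang
                       for erased, lang in zip(interp, languages)]
            result |= catenate(factors)
    return result

# Hypothetical example: exactly one of positions 1 and 2 is erased (XOR).
xor = lambda i: i[0] != i[1]
print(sorted(constrained_tildes(xor, [{"a"}, {"b"}])))  # ['a', 'b']
```

With the constant-true formula every subset of positions may be erased, so the same two languages yield the four words "", "a", "b" and "ab"; with a formula that forbids any erasure only "ab" remains.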

Figures (5)

Figure 1: The formula $\neg(a \wedge b) \wedge (a \wedge c)$ is satisfiable.
Figure 2: The derived term automaton of $E$.
Figure 3: The Glushkov automaton of $E$.
Figure 4: The Glushkov automaton of $E$.
Figure 5: The Web Interface.
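
The claim in the caption of Figure 1 can be checked by brute force. The sketch below, in Python rather than the paper's Haskell, enumerates every truth assignment of $a$, $b$, $c$ and keeps those that satisfy $\neg(a \wedge b) \wedge (a \wedge c)$. The helper name `satisfying_assignments` is illustrative, and the method the paper itself uses to decide satisfiability may differ.

```python
from itertools import product

def formula(a, b, c):
    """The formula from Figure 1: not (a and b), and (a and c)."""
    return (not (a and b)) and (a and c)

def satisfying_assignments(f, arity):
    """All tuples of truth values on which the predicate f holds."""
    return [vals for vals in product([False, True], repeat=arity) if f(*vals)]

models = satisfying_assignments(formula, 3)
print(models)  # [(True, False, True)]
```

The single model sets $a$ and $c$ to true and $b$ to false, so the formula is satisfiable.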

Theorems & Definitions (45)

Example 1
Example 2
Example 3
Theorem 4
Lemma 5
Lemma 6
Lemma 7
Proposition 8
Proposition 9
Example 10
...and 35 more
