Constrained Multi-Tildes: Derived Term and Position Automata

Samira Attou, Ludovic Mignot, Clément Miklarz, Florent Nicart

TL;DR

This work extends multi-tildes from disjunctive combinations to arbitrary Boolean combinations, which allows richer and potentially exponentially smaller regular expressions while preserving regularity. It defines the semantics through Boolean formulae over tilde positions and extends partial derivatives and Glushkov-style position automata to constrained tildes, using a surlinearization technique to ensure correctness. The results show that regularity is preserved, give two finite automaton constructions (derived-term and Glushkov) for constrained tildes, and include a Haskell implementation with visualization tooling and a web interface. This lays groundwork for converting automata back into expressions, for more compact representations of regular languages, and for efficient membership testing under tilde-constrained semantics.

Abstract

Multi-tildes are regular operators that were introduced to enhance the factorization power of regular expressions, allowing us to add the empty word in several factors of a catenation product of languages. Together with multi-bars, which dually remove the empty word, they allow any acyclic automaton to be represented by a linear-sized expression, whereas the lower bound is exponential in the classic case. In this paper, we extend multi-tildes from disjunctive combinations to any Boolean combination, allowing us to exponentially enhance the factorization power of tilde expressions. Moreover, we show how to convert these expressions into finite automata and give a Haskell implementation of them using advanced techniques of functional programming.

Paper Structure (12 sections, 27 theorems, 51 equations, 5 figures)

Key Result

Theorem 4

Let $\phi$ be a Boolean formula over the alphabet $\{1, \ldots, n\}$ and $L_1, \ldots, L_n$ be $n$ regular languages. Then $\phi(L_1, \ldots, L_n)$ is regular.
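
To make the notation $\phi(L_1, \ldots, L_n)$ concrete, the following Python sketch computes it for finite languages. It assumes a reading that this summary does not spell out: $\phi(L_1, \ldots, L_n)$ is the union, over every valuation of the positions that satisfies $\phi$, of the catenation in which factor $k$ is replaced by $\{\varepsilon\}$ when position $k$ is true. Under that assumed reading the result is a union of at most $2^n$ catenations of regular languages, which is one way to see why regularity would be preserved. The function name `constrained_tildes` and the encoding of $\phi$ as a Python predicate over a tuple of Booleans (0-indexed, so position $k$ is `v[k-1]`) are illustrative choices, not taken from the paper or its Haskell implementation.

```python
from itertools import product

def constrained_tildes(phi, languages):
    """Apply phi to finite languages, under the ASSUMED convention that a
    position set to True has its factor replaced by the empty word."""
    n = len(languages)
    result = set()
    for valuation in product([False, True], repeat=n):
        if not phi(valuation):
            continue  # this valuation does not satisfy the formula
        # Factor k contributes only "" when its tilde is active.
        factors = [{""} if active else lang
                   for active, lang in zip(valuation, languages)]
        # Catenate the factors from left to right.
        words = {""}
        for factor in factors:
            words = {u + v for u in words for v in factor}
        result |= words
    return result

# Hypothetical example: exactly one of the two factors is erased (exclusive or).
xor = lambda v: v[0] != v[1]
print(sorted(constrained_tildes(xor, [{"a"}, {"b"}])))  # ['a', 'b']
```

With the always-true formula the same call returns every combination of erased and kept factors, and with an unsatisfiable formula it returns the empty language.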

Figures (5)

  • Figure 1: The formula $\neg(a \wedge b) \wedge (a \wedge c)$ is satisfiable.
  • Figure 2: The derived term automaton of $E$.
  • Figure 3: The Glushkov automaton of $E$.
  • Figure 4: The Glushkov automaton of $E$.
  • Figure 5: The Web Interface.
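
The satisfiability claim in the caption of Figure 1 can be checked by brute force. The sketch below enumerates all truth assignments of $a$, $b$, $c$ and keeps those under which $\neg(a \wedge b) \wedge (a \wedge c)$ holds. The helper name `satisfying_assignments` is hypothetical, and this exhaustive enumeration is only an illustration, not the method shown in the figure.

```python
from itertools import product

def satisfying_assignments(formula, variables):
    """Return every assignment (as a dict) under which `formula` is true."""
    return [dict(zip(variables, values))
            for values in product([False, True], repeat=len(variables))
            if formula(*values)]

# The formula from the caption of Figure 1: not (a and b) and (a and c).
figure1 = lambda a, b, c: (not (a and b)) and (a and c)

print(satisfying_assignments(figure1, ["a", "b", "c"]))
# [{'a': True, 'b': False, 'c': True}]
```

The conjunct $a \wedge c$ forces $a$ and $c$ to be true, and $\neg(a \wedge b)$ then forces $b$ to be false, so this is the only satisfying assignment.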

Theorems & Definitions (45)

  • Example 1
  • Example 2
  • Example 3
  • Theorem 4
  • Lemma 5
  • Lemma 6
  • Lemma 7
  • Proposition 8
  • Proposition 9
  • Example 10
  • ...and 35 more