Constrained Multi-Tildes: Derived Term and Position Automata
Samira Attou, Ludovic Mignot, Clément Miklarz, Florent Nicart
TL;DR
This work extends multi-tildes from disjunctive combinations to arbitrary Boolean combinations, yielding regular-expression specifications that are richer and potentially exponentially smaller while still denoting regular languages. It gives constrained tildes a semantics based on Boolean formulae over tilde positions, and extends partial derivatives and Glushkov-style position automata to them, using a surlinearization technique to ensure correctness. The results are a proof that regularity is preserved, two finite-automaton constructions for constrained tildes (derived-term and Glushkov), and a practical Haskell implementation with visualization tooling and a web interface. This lays the groundwork for converting automata back into expressions, for more compact representations of regular languages, and for efficient membership testing on expressions with constrained tildes.
Abstract
Multi-tildes are regular operators that were introduced to enhance the factorization power of regular expressions, allowing us to add the empty word in several factors of a catenation product of languages. Combined with multi-bars, which dually remove the empty word, they allow representing any acyclic automaton by a linear-sized expression, whereas the lower bound is exponential in the classic case. In this paper, we extend multi-tildes from disjunctive combinations to any Boolean combination, allowing us to exponentially enhance the factorization power of tilde expressions. Moreover, we show how to convert these expressions into finite automata and give a Haskell implementation of them using advanced techniques of functional programming.
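To make the idea of a Boolean constraint on tildes concrete, here is a minimal sketch on finite languages. It is an illustration only and not the authors' Haskell implementation. It assumes one plausible reading of the semantics, which the abstract does not spell out: an assignment of truth values to the positions of a catenation product marks which factors are replaced by the empty word, and the operator denotes the union of the products obtained from the assignments that satisfy the constraint. The names `catenate` and `constrained_tilde`, and the encoding of a constraint as a Python predicate, are hypothetical.

```python
from itertools import product


def catenate(languages):
    """Catenation product of finite languages, each given as a set of strings."""
    result = {""}
    for lang in languages:
        result = {u + v for u in result for v in lang}
    return result


def constrained_tilde(constraint, languages):
    """Language of a constrained tilde over finite languages (assumed semantics).

    `constraint` is a predicate on a tuple of booleans, one per factor;
    True at index k means "factor k is replaced by the empty word".
    """
    words = set()
    for erased in product([False, True], repeat=len(languages)):
        if constraint(erased):
            factors = [{""} if e else lang for e, lang in zip(erased, languages)]
            words |= catenate(factors)
    return words


factors = [{"a"}, {"b"}, {"c"}]

# No restriction: every subset of factors may be erased.
unconstrained = constrained_tilde(lambda e: True, factors)

# A non-disjunctive constraint: the first factor is erased
# if and only if the third one is.
linked = constrained_tilde(lambda e: e[0] == e[2], factors)
```

Under these assumptions, `unconstrained` contains the eight words obtained by erasing any subset of the letters of `abc`, whereas `linked` contains only `abc`, `ac`, `b`, and the empty word. Enumerating every assignment is exponential in the number of factors, so this sketch only serves to fix intuitions about the semantics; it says nothing about the automaton constructions of the paper.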
