Automata on Graph Alphabets
Hugo Bazille, Uli Fahrenberg
TL;DR
The paper develops an automata theory for languages over graph alphabets, where concatenation is constrained by endpoint compatibility on a directed graph. It establishes graph analogues of the Kleene and Myhill-Nerode theorems: rational and regular languages coincide, and a language is regular iff it has finite prefix or, equivalently, suffix quotient. Under suitable finiteness conditions, the Myhill-Nerode automaton provides a minimal deterministic automaton. The paper also proves determinization and minimization results and discusses complementation, under which regular languages are not closed for infinite graphs, although complementation remains workable for finite graphs. It outlines an extension to presimplicial alphabets and higher-dimensional automata, highlights ST-automata as motivating examples, and sketches future directions in logic, richer alphabets, and higher-order types, aiming at a broader automata theory for structured data beyond free monoids.
Abstract
The theory of finite automata concerns itself with words in a free monoid together with concatenation and without further structure. There are, however, important applications that use alphabets which are structured in some sense. We introduce automata over a particular type of structured data, namely an alphabet which is given as a (finite or infinite) directed graph. This constrains concatenation: two strings may only be concatenated if the end vertex of the first is equal to the start vertex of the second. We develop the beginnings of an automata theory for languages on graph alphabets. We show that they admit a Kleene theorem, relating rational and regular languages, and a Myhill-Nerode theorem, stating that languages are regular iff they have finite prefix or, equivalently, suffix quotient. We present determinization and minimization algorithms, but we also show that regular languages are not closed under complementation. Finally, we mention how these structures could be generalized to presimplicial alphabets, where languages are no longer freely generated.
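The concatenation constraint described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `Edge`, `Path`, `identity`, `single`, and `concat` are hypothetical, and the sketch assumes that strings are paths in the graph, with an empty path at each vertex acting as a unit.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Edge:
    """A letter of the graph alphabet: a labelled edge from src to tgt."""
    label: str
    src: str
    tgt: str


@dataclass(frozen=True)
class Path:
    """A string over the graph alphabet, stored with its start and end vertex."""
    start: str
    end: str
    edges: Tuple[Edge, ...] = ()


def identity(vertex: str) -> Path:
    """Empty path at a vertex (assumed here to act as a unit for concatenation)."""
    return Path(vertex, vertex)


def single(edge: Edge) -> Path:
    """One-letter path consisting of a single edge."""
    return Path(edge.src, edge.tgt, (edge,))


def concat(p: Path, q: Path) -> Optional[Path]:
    """Concatenate p and q; return None when the endpoints do not match."""
    if p.end != q.start:
        return None  # concatenation is partial on a graph alphabet
    return Path(p.start, q.end, p.edges + q.edges)


# Hypothetical two-edge graph: x --a--> y --b--> z
a = Edge("a", "x", "y")
b = Edge("b", "y", "z")
ab = concat(single(a), single(b))   # defined: a ends at y, b starts at y
ba = concat(single(b), single(a))   # undefined: b ends at z, a starts at x
```

Here `ab` is a path from `x` to `z`, whereas `ba` is `None`. In a free monoid both products would exist, so the partiality of `concat` is what separates graph alphabets from the classical setting.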
