Random Deterministic Automata With One Added Transition
Arnaud Carayol, Philippe Duchon, Florent Koechlin, Cyril Nicaud
TL;DR
This paper investigates the state complexity of languages recognized by random almost deterministic automata, obtained by adding a single random transition to a uniform random $n$-state deterministic automaton. It establishes that for any fixed $d\geq 1$, with non-negligible probability the minimal DFA recognizing the resulting language has more than $n^d$ states, which implies that the expected state complexity grows faster than any polynomial. The proof develops a probabilistic framework built on backward substructures, forward trees, and b-threads, and reduces key calculations to Galton–Watson processes with Poisson(2) offspring, facilitated by a novel template formalism. The results show that the powerset construction has a non-negligible chance of combinatorial explosion even under minimal non-determinism, and they provide insight into the distribution of regular languages induced by random deterministic automata.
Abstract
Every language recognized by a non-deterministic finite automaton can be recognized by a deterministic automaton, at the cost of a potential increase in the number of states, which in the worst case can go from $n$ states to $2^n$ states. In this article, we investigate this classical result in a probabilistic setting where we take a deterministic automaton with $n$ states uniformly at random and add just one random transition. These automata are almost deterministic in the sense that only one state has a non-deterministic choice when reading an input letter. In our model, each state has a fixed probability of being final. We prove that for any $d\geq 1$, with non-negligible probability the minimal (deterministic) automaton of the language recognized by such an automaton has more than $n^d$ states; as a byproduct, the expected size of its minimal automaton grows faster than any polynomial. Our result also holds when each state is final with some probability that depends on $n$, as long as it is not too close to $0$ or $1$, at distance at least $\Omega(\frac{1}{\sqrt{n}})$ to be precise, therefore allowing models with a sublinear number of final states in expectation.
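
The random model described above can be explored empirically with a minimal Python sketch. It samples a uniform random deterministic transition table, adds one extra transition, and counts the subsets reachable in the powerset construction. Several choices here are illustrative assumptions rather than details taken from the paper: the binary alphabet (`k = 2`), the use of state `0` as initial state, the requirement that the added transition targets a state different from the existing one, and all function names.

```python
import random


def random_dfa(n, k, rng):
    """Return a transition table where delta[p][a] is a uniform random state."""
    return [[rng.randrange(n) for _ in range(k)] for _ in range(n)]


def add_random_transition(delta, rng):
    """Copy delta as an NFA (sets of targets) and add one extra transition.

    Assumption: the new target differs from the existing one, so exactly one
    (state, letter) pair has a non-deterministic choice. Requires n >= 2.
    """
    n, k = len(delta), len(delta[0])
    nfa = [[{q} for q in row] for row in delta]
    p, a = rng.randrange(n), rng.randrange(k)
    q = rng.choice([s for s in range(n) if s != delta[p][a]])
    nfa[p][a].add(q)
    return nfa


def count_reachable_subsets(nfa, initial=0):
    """Count the subsets of states reachable in the powerset construction."""
    k = len(nfa[0])
    start = frozenset([initial])
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for a in range(k):
            # Image of the current subset under letter a.
            nxt = frozenset(q for p in current for q in nfa[p][a])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed, chosen arbitrarily
    n, k = 30, 2
    nfa = add_random_transition(random_dfa(n, k, rng), rng)
    print(n, count_reachable_subsets(nfa))
```

Note that this counts accessible states of the determinized automaton only. It ignores final states and does not minimize, so the value is an upper bound on the state complexity studied in the paper, not the quantity itself. Because the number of reachable subsets can grow very quickly, it is prudent to keep `n` small when experimenting.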
