State Complexity of Multiple Concatenation
Jozef Jirásek, Galina Jirásková
TL;DR
This paper studies the state complexity of the multiple concatenation L_1 L_2 ... L_k of regular languages. The authors construct explicit witness DFAs that meet the upper bound, first over an alphabet of size k+1 and then over an alphabet of size k, with a significantly simpler proof based on an NFA-driven subset construction; distinguishability of all valid states is shown via co-reachability of singleton states. They show that the ternary alphabet is optimal for k=3, that a trivial upper bound is asymptotically tight for ternary languages, and that the lower bound remains exponential in the binary case. For unary cyclic languages, and for unary automata without final states in their tails, they obtain a tight upper bound via Frobenius-number-type calculations. The results resolve open questions from Caron et al. and Gao–Yu, and the paper lists several open problems for future work.
Abstract
We describe witness languages meeting the upper bound on the state complexity of the multiple concatenation of $k$ regular languages over an alphabet of size $k+1$ with a significantly simpler proof than that in the literature. We also consider the case where some languages may be recognized by two-state automata. Then we show that one symbol can be saved, and we define witnesses for the multiple concatenation of $k$ languages over a $k$-letter alphabet. This solves an open problem stated by Caron et al. [2018, Fundam. Inform. 160, 255--279]. We prove that for the concatenation of three languages, the ternary alphabet is optimal. We also show that a trivial upper bound on the state complexity of multiple concatenation is asymptotically tight for ternary languages, and that a lower bound remains exponential in the binary case. Finally, we obtain a tight upper bound for unary cyclic languages and languages recognized by unary automata that do not have final states in their tails.
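The quantity studied here can be explored experimentally on small inputs. The following is a minimal Python sketch, not the authors' construction: it builds the standard NFA for the concatenation of $k$ complete DFAs, determinizes it with the subset construction, and counts the pairwise distinguishable reachable subsets using Moore-style partition refinement. The dictionary layout of a DFA (`start`, `finals`, `delta`) and the function name `concat_state_complexity` are assumptions made for this illustration only.

```python
from collections import deque

def concat_state_complexity(dfas, alphabet):
    """Return (reachable_subsets, minimal_dfa_size) for L(A_1) L(A_2) ... L(A_k).

    Assumed (hypothetical) DFA layout: a dict with keys 'start', 'finals' (a set)
    and 'delta' (a dict mapping (state, symbol) -> state). Each DFA must be complete.
    """
    k = len(dfas)

    def close(subset):
        # Whenever A_i is in a final state, A_{i+1} may start reading.
        result = set(subset)
        for i in range(k - 1):  # increasing i lets the effect cascade
            if any(j == i and q in dfas[i]['finals'] for (j, q) in result):
                result.add((i + 1, dfas[i + 1]['start']))
        return frozenset(result)

    def step(subset, symbol):
        moved = {(i, dfas[i]['delta'][(q, symbol)]) for (i, q) in subset}
        return close(moved)

    def accepting(subset):
        return any(i == k - 1 and q in dfas[k - 1]['finals'] for (i, q) in subset)

    # Subset construction restricted to reachable subsets.
    start = close({(0, dfas[0]['start'])})
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for a in alphabet:
            nxt = step(current, a)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    # Moore-style refinement: split blocks until the partition is stable.
    block = {s: int(accepting(s)) for s in seen}
    while True:
        signature = {s: (block[s],) + tuple(block[step(s, a)] for a in alphabet)
                     for s in seen}
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in seen}
        if len(set(new_block.values())) == len(set(block.values())):
            break
        block = new_block
    return len(seen), len(set(block.values()))

# Toy input (not a witness from the paper): words over {a, b} that end in 'a'.
ends_in_a = {
    'start': 0,
    'finals': {1},
    'delta': {(0, 'a'): 1, (1, 'a'): 1, (0, 'b'): 0, (1, 'b'): 0},
}
print(concat_state_complexity([ends_in_a, ends_in_a], "ab"))  # → (4, 3)
```

On this toy input four subsets are reachable, but two of them are equivalent, so the minimal DFA has three states. The witness languages described in the abstract are those for which every valid subset is both reachable and distinguishable, so that the upper bound is met.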
