State Complexity of Multiple Concatenation

Jozef Jirásek, Galina Jirásková

TL;DR

This work advances the understanding of the state complexity of the multiple concatenation $L_1 L_2 \cdots L_k$ of regular languages by constructing explicit witness DFAs over alphabets of size $k+1$ and $k$, and by deriving tight upper and lower bounds across binary, ternary, and unary regimes. The authors provide a significantly simpler proof for the upper bound using an NFA-driven subset construction and demonstrate that all valid states become distinguishable via co-reachability of singleton states. They prove the upper bound is tight for multiple alphabet sizes, show ternary alphabets are optimal for $k=3$, and give tight bounds for unary cyclic languages via Frobenius-number-type calculations. The results resolve open questions from Caron et al. and Gao–Yu while laying out several intriguing open problems for future work.

Abstract

We describe witness languages meeting the upper bound on the state complexity of the multiple concatenation of $k$ regular languages over an alphabet of size $k+1$ with a significantly simpler proof than that in the literature. We also consider the case where some languages may be recognized by two-state automata. Then we show that one symbol can be saved, and we define witnesses for the multiple concatenation of $k$ languages over a $k$-letter alphabet. This solves an open problem stated by Caron et al. [2018, Fundam. Inform. 160, 255--279]. We prove that for the concatenation of three languages, the ternary alphabet is optimal. We also show that a trivial upper bound on the state complexity of multiple concatenation is asymptotically tight for ternary languages, and that a lower bound remains exponential in the binary case. Finally, we obtain a tight upper bound for unary cyclic languages and languages recognized by unary automata that do not have final states in their tails.

Paper Structure

This paper contains 8 sections, 20 theorems, 46 equations, 17 figures.

Key Result

Lemma 1

Let $N=(Q,\Sigma,\cdot,I,F)$ be an NFA. Let $S,T\subseteq Q$ and $q\in S\setminus T$. If the singleton set $\{q\}$ is co-reachable in $N$, then $S$ and $T$ are distinguishable in the subset automaton $\mathcal{D}(N)$.
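The argument behind Lemma 1 can be illustrated with a small Python sketch. It assumes the usual meaning of co-reachability: a set is co-reachable in $N$ if it is reachable from $F$ in the reversed automaton, so that some string $w$ is accepted from $q$ and from no other state. The toy NFA and the helper names `co_reach_witness` and `accepts_from` are hypothetical and not taken from the paper.

```python
from collections import deque

def co_reach_witness(delta, finals, alphabet, target):
    """BFS in the reversed NFA, starting from the final states.

    Returns a string w (read forwards) such that the set of states from
    which w is accepted equals `target`, or None if no such string exists.
    `delta` maps (state, symbol) to a set of successor states.
    """
    start = frozenset(finals)
    word = {start: ""}
    queue = deque([start])
    while queue:
        X = queue.popleft()
        if X == target:
            return word[X]
        for a in alphabet:
            # States with at least one a-successor inside X.
            Y = frozenset(p for (p, s), succ in delta.items()
                          if s == a and succ & X)
            if Y not in word:
                word[Y] = a + word[X]
                queue.append(Y)
    return None

def accepts_from(delta, finals, S, w):
    """Does the subset S (a state of the subset automaton) accept w?"""
    cur = set(S)
    for a in w:
        cur = {t for p in cur for t in delta.get((p, a), set())}
    return bool(cur & set(finals))

# Hypothetical toy NFA over {a, b} with states 0, 1, 2 and final state 2.
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
finals = {2}

q, S, T = 1, {0, 1}, {0}          # q is in S but not in T
w = co_reach_witness(delta, finals, "ab", frozenset({q}))
print(w, accepts_from(delta, finals, S, w), accepts_from(delta, finals, T, w))
```

Since `w` is accepted from `q` only, every subset containing `q` accepts it and every subset missing `q` rejects it, which is exactly the distinguishing string the lemma promises.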

Figures (17)

  • Figure 1: DFAs $A_1$ and $A_2$ with all valid states reachable in $\mathcal{D}(N)$.
  • Figure 2: The NFA $N$ recognizing the language $L(A_1)L(A_2)$.
  • Figure 3: The NFA recognizing the language $L(A_1')L(A_2')$.
  • Figure 4: The witness DFA $A_i$ over the $(k+1)$-letter alphabet $\{b, a_1,a_2,\ldots,a_k\}$; $n_i\ge3$.
  • Figure 5: The NFA $N_i$ recognizing the language $L(A_{i-1})L(A_i)$.
  • ...and 12 more figures
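As a rough illustration of the objects in Figures 1 and 2, the following Python sketch builds an epsilon-free NFA for $L(A_1)L(A_2)$ from two DFAs and explores the reachable part of its subset automaton. The two small DFAs, the dictionary encoding, and all function names are assumptions made for this example; they are not the paper's witness automata.

```python
def concat_nfa(A, B):
    """Epsilon-free NFA for L(A)L(B); states are tagged (1, q) or (2, q)."""
    def step(state, sym):
        part, q = state
        if part == 2:
            return {(2, B["delta"][q, sym])}
        p = A["delta"][q, sym]
        out = {(1, p)}
        if p in A["finals"]:
            # Entering a final state of A may also start a run of B.
            out.add((2, B["start"]))
        return out

    init = {(1, A["start"])}
    if A["start"] in A["finals"]:
        init.add((2, B["start"]))
    finals = {(2, f) for f in B["finals"]}
    return frozenset(init), step, finals

def move(S, sym, step):
    """One transition of the subset automaton."""
    return frozenset(t for s in S for t in step(s, sym))

def reachable_subsets(init, step, alphabet):
    """All subsets reachable from the initial subset."""
    seen, todo = {init}, [init]
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = move(S, a, step)
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return seen

def run_dfa(D, w):
    q = D["start"]
    for a in w:
        q = D["delta"][q, a]
    return q in D["finals"]

def run_subsets(init, step, finals, w):
    S = init
    for a in w:
        S = move(S, a, step)
    return bool(S & finals)

# Hypothetical two-state DFAs: A accepts an odd number of a's,
# B accepts an odd number of b's.
A = {"start": 0, "finals": {1},
     "delta": {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}}
B = {"start": 0, "finals": {1},
     "delta": {(0, "a"): 0, (1, "a"): 1, (0, "b"): 1, (1, "b"): 0}}

init, step, finals = concat_nfa(A, B)
subsets = reachable_subsets(init, step, "ab")
print(len(subsets), "reachable subsets")
```

Every reachable subset in this construction contains exactly one state of the first automaton, so an $m$-state $A_1$ and an $n$-state $A_2$ yield at most $m \cdot 2^n$ reachable subsets.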

Theorems & Definitions (47)

  • Lemma 1
  • proof
  • Corollary 2
  • Definition 3
  • Proposition 4
  • Proposition 5
  • proof
  • Theorem 6
  • proof
  • Proposition 8
  • ...and 37 more