A Lower Bound on Unambiguous Context Free Grammars via Communication Complexity
Stefan Mengel, Harry Vinall-Smeeth
TL;DR
The paper studies how succinctly finite languages can be represented by context free grammars (CFGs) compared with unambiguous CFGs (uCFGs). It introduces a rectangle-cover framework that ties grammar size to disjoint rectangle covers, and uses a discrepancy-based argument from communication complexity to derive exponential lower bounds on uCFG representations of a family of finite languages $L_n$. The main result is a doubly exponential separation between general CFGs and uCFGs for $L_n$, which also implies that NFAs can be exponentially more succinct than uCFGs for finite languages. The approach advances the understanding of unambiguity and suggests broader applicability to factorised representations and knowledge compilation.
Abstract
Motivated by recent connections to factorised databases, we analyse the efficiency of representations by context free grammars (CFGs). Concretely, we prove a recent conjecture by Kimelfeld, Martens, and Niewerth (ICDT 2025) that, for finite languages, representations by general CFGs can be doubly-exponentially smaller than those by unambiguous CFGs. To do so, we show the first exponential lower bounds for representation by unambiguous CFGs of a finite language that can efficiently be represented by CFGs. Our proof first reduces the problem to proving a lower bound in a non-standard model of communication complexity. Then, in the spirit of a recent discrepancy argument, we show the required communication complexity lower bound. Our result also implies that a finite language may admit an exponentially smaller representation as a nondeterministic finite automaton than as an unambiguous CFG.
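To make the notion of unambiguity concrete, the following is a minimal sketch that counts parse trees for two toy grammars in Chomsky normal form. The grammars, the toy language (words of length 2 over {a, b} containing at least one `a`), and the names `count_parses`, `AMBIGUOUS`, and `UNAMBIGUOUS` are illustrative assumptions and are not taken from the paper. A grammar is unambiguous exactly when every word of its language has one parse tree.

```python
from functools import lru_cache
from itertools import product


def count_parses(grammar, start, word):
    """Count parse trees of `word` from `start` in a CNF grammar.

    `grammar` maps a nonterminal to a list of right-hand sides, each either
    a single terminal character (str) or a pair of nonterminals (tuple).
    """

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        total = 0
        for rhs in grammar[symbol]:
            if isinstance(rhs, str):
                # Terminal rule: matches only a span of length 1.
                if j - i == 1 and word[i] == rhs:
                    total += 1
            else:
                left, right = rhs
                # Binary rule: try every split point of word[i:j].
                for k in range(i + 1, j):
                    total += count(left, i, k) * count(right, k, j)
        return total

    return count(start, 0, len(word))


# Hypothetical toy grammars, both for "length-2 words with at least one a".
# AMBIGUOUS takes the union "first letter is a" OR "second letter is a";
# the two cases overlap on the word "aa".
AMBIGUOUS = {
    "S": [("A", "X"), ("X", "A")],
    "A": ["a"],
    "X": ["a", "b"],
}
# UNAMBIGUOUS splits the same language into two disjoint cases:
# "first letter is a" OR "first letter is b and second letter is a".
UNAMBIGUOUS = {
    "S": [("A", "X"), ("B", "A")],
    "A": ["a"],
    "B": ["b"],
    "X": ["a", "b"],
}


def parse_counts(grammar):
    """Return the parse-tree count for every word of length 2 over {a, b}."""
    words = ("".join(p) for p in product("ab", repeat=2))
    return {w: count_parses(grammar, "S", w) for w in words}


if __name__ == "__main__":
    print(parse_counts(AMBIGUOUS))
    print(parse_counts(UNAMBIGUOUS))
```

Both grammars accept the same three words, but the first one derives `aa` in two ways, while the second derives every accepted word in exactly one way. The second grammar achieves this by making its two cases disjoint, which is loosely the intuition behind relating uCFG size to disjoint covers; the sketch does not reproduce the paper's rectangle definition or its lower-bound argument.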
