Term Coding for Extremal Combinatorics: Dispersion and Complexity Dichotomies
Søren Riis
TL;DR
Term Coding is a quantifier-free, negation-free framework that encodes extremal combinatorial problems as finite systems of term equations, optionally with non-equality constraints. The author relates the maximum number of solutions $\max_{\mathcal{I}}(\Gamma,n)$ to graph guessing numbers and entropy, and extends the framework to a multi-sorted setting so that designs, finite geometries, and mixed coding scenarios can be analysed uniformly. A main result is a finite-bounds theorem that ties code size to the guessing number of an associated graph. Dispersion problems are studied as a natural subclass and exhibit a striking complexity dichotomy in the single-sorted case: deciding whether the maximum can reach $n^r$ is undecidable, while deciding whether it eventually exceeds $n^r+1$ is polynomial-time decidable. The work also captures first-order (FO) finite satisfiability via multi-sorted Term Coding, which yields undecidability in the general setting, and it establishes a toolkit (normalisation, diversification, guessing-number bounds) for extremal and existence questions. Term Coding thereby connects extremal combinatorics, finite model theory, and network coding, linking algebraic representations to information-theoretic limits, with potential implications for combinatorial design and algorithmic reasoning.
Abstract
We introduce \emph{Term Coding}, a novel framework for analysing extremal problems in discrete mathematics by encoding them as finite systems of \emph{term equations} (and, optionally, \emph{non-equality constraints}). In its basic form, all variables range over a single domain, and we seek an interpretation of the function symbols that \emph{maximises} the number of solutions to these constraints. This perspective unifies classical questions in extremal combinatorics, network/index coding, and finite model theory. We further develop \emph{multi-sorted Term Coding}, a more general approach in which variables may be of different sorts (e.g., points, lines, blocks, colours, labels), possibly supplemented by variable-inequality constraints to enforce distinctness. This extension captures sophisticated structures such as block designs, finite geometries, and mixed coding scenarios within a single logical formalism. Our main result shows how to determine (up to a constant) the maximum number of solutions \(\max_{\mathcal{I}}(\Gamma,n)\) for any system of term equations (possibly including non-equality constraints) by relating it to \emph{graph guessing numbers} and \emph{entropy measures}. Finally, we focus on \emph{dispersion problems}, an expressive subclass of these constraints. We discover a striking complexity dichotomy: deciding whether, for a given integer \(r\), the maximum code size reaches \(n^{r}\) is \emph{undecidable}, while deciding whether it exceeds \(n^{r}\) is \emph{polynomial-time decidable}.
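To make the quantity \(\max_{\mathcal{I}}(\Gamma,n)\) concrete, the following minimal sketch computes it by brute force for one toy system. The system \(\Gamma = \{x = f(y,z),\ y = g(x,z),\ z = h(x,y)\}\) and the helper name `max_solutions` are illustrative assumptions and are not taken from the paper; the system is chosen because it corresponds to the guessing game on the triangle graph. The search enumerates every interpretation of \(f, g, h\) over the domain \(\{0,\dots,n-1\}\), so it is only practical for \(n \le 2\).

```python
from itertools import product


def max_solutions(n):
    """Brute-force the maximum, over all interpretations (f, g, h), of the
    number of assignments (x, y, z) in {0..n-1}^3 that satisfy the toy system
        x = f(y, z),  y = g(x, z),  z = h(x, y).
    The system is a hypothetical example chosen for illustration."""
    domain = range(n)
    pairs = list(product(domain, repeat=2))
    # Every binary function on the domain, stored as a lookup table.
    tables = [dict(zip(pairs, values))
              for values in product(domain, repeat=len(pairs))]
    best = 0
    for f in tables:
        # The first equation fixes x once y and z are chosen.
        sols_f = [(f[(y, z)], y, z) for y, z in pairs]
        for g in tables:
            sols_fg = [(x, y, z) for x, y, z in sols_f if g[(x, z)] == y]
            if len(sols_fg) <= best:
                continue  # no choice of h can beat the current best
            for h in tables:
                count = sum(1 for x, y, z in sols_fg if h[(x, y)] == z)
                best = max(best, count)
    return best


if __name__ == "__main__":
    print(max_solutions(2))  # prints 4
```

For \(n = 2\) the optimum is \(4 = 2^{2}\), attained for instance by interpreting all three symbols as addition modulo 2; this agrees with the guessing number 2 of the triangle graph.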
