Term Coding for Extremal Combinatorics: Dispersion and Complexity Dichotomies

Søren Riis

TL;DR

Term Coding is a quantifier-free, negation-free framework that encodes extremal combinatorial problems as finite systems of term equations, optionally with non-equality constraints. The paper relates the maximum number of solutions $\max_{\mathcal{I}}(\Gamma,n)$ to graph guessing numbers and entropy, and extends the framework to multi-sorted settings so that designs, finite geometries, and mixed coding scenarios can be analysed uniformly. A main result is a finite-bounds theorem tying code size to the guessing number of an associated graph. Dispersion problems are studied as a natural subclass and show a striking complexity dichotomy in the single-sorted case: deciding whether the maximum can reach $n^r$ is undecidable, while deciding whether it eventually exceeds $n^r+1$ is polynomial-time decidable. The work also generalises to first-order (FO) finite satisfiability via multi-sorted Term Coding, showing undecidability in the general setting and establishing a toolkit (normalisation, diversification, guessing-number bounds) for extremal and existence questions. Altogether, Term Coding connects extremal combinatorics, finite model theory, and network coding by linking algebraic representations to information-theoretic limits, with potential implications for combinatorial design and algorithmic reasoning.

Abstract

We introduce \emph{Term Coding}, a novel framework for analysing extremal problems in discrete mathematics by encoding them as finite systems of \emph{term equations} (and, optionally, \emph{non-equality constraints}). In its basic form, all variables range over a single domain, and we seek an interpretation of the function symbols that \emph{maximises} the number of solutions to these constraints. This perspective unifies classical questions in extremal combinatorics, network/index coding, and finite model theory. We further develop \emph{multi-sorted Term Coding}, a more general approach in which variables may be of different sorts (e.g., points, lines, blocks, colours, labels), possibly supplemented by variable-inequality constraints to enforce distinctness. This extension captures sophisticated structures such as block designs, finite geometries, and mixed coding scenarios within a single logical formalism. Our main result shows how to determine (up to a constant) the maximum number of solutions \(\max_{\mathcal{I}}(\Gamma,n)\) for any system of term equations (possibly including non-equality constraints) by relating it to \emph{graph guessing numbers} and \emph{entropy measures}. Finally, we focus on \emph{dispersion problems}, an expressive subclass of these constraints. We discover a striking complexity dichotomy: deciding whether, for a given integer \(r\), the maximum code size reaches \(n^{r}\) is \emph{undecidable}, while deciding whether it exceeds \(n^{r}\) is \emph{polynomial-time decidable}.
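
To make the maximisation in the abstract concrete, the following is a minimal brute-force sketch, not the paper's method. It takes the two term equations quoted in the Figure 2 caption, $f(x,f(x,y))=y$ and $f(f(x,y),y)=x$, enumerates every interpretation of the binary symbol $f$ on a domain of size $n$, and counts the assignments $(x,y)$ that satisfy both equations. The function name `max_solutions` and the choice to count assignments to $(x,y)$ are assumptions made for illustration; the search is only feasible for very small $n$.

```python
from itertools import product


def max_solutions(n):
    """Maximum, over all f: D x D -> D with |D| = n, of the number of
    pairs (x, y) satisfying f(x, f(x, y)) == y and f(f(x, y), y) == x.

    Illustrative brute force: there are n ** (n * n) interpretations.
    """
    domain = range(n)
    pairs = list(product(domain, repeat=2))
    best = 0
    # Each tuple of values is one interpretation of f (its full table).
    for values in product(domain, repeat=len(pairs)):
        f = dict(zip(pairs, values))
        count = sum(
            1
            for x, y in pairs
            if f[(x, f[(x, y)])] == y and f[(f[(x, y)], y)] == x
        )
        best = max(best, count)
    return best


if __name__ == "__main__":
    for n in (2, 3):
        print(n, max_solutions(n))
```

For these two equations the trivial upper bound $n^2$ is attained, for instance by $f(x,y) = -x-y \bmod n$, so the search returns 4 for $n=2$ and 9 for $n=3$.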

Paper Structure

This paper contains 61 sections, 16 theorems, 115 equations, 3 figures, 3 tables.

Key Result

Proposition 2.3

Let $\langle \Gamma, \Delta \rangle$ be any system (single‑ or multi‑sorted). Let $\langle \Gamma', \Delta' \rangle$ be obtained by flattening every compound subterm and by replacing each disequality $s\neq t$ with the atomic $x_i\neq x_j$ that references the newly introduced variables (Definition d

Figures (3)

  • Figure 1: Directed graph showing the functional dependencies for the normalised and diversified term equations derived from the unsolvable self-decoding Latin square variant (introduced in Section \ref{sec:variant}). The graph has six nodes, labelled according to the variables $\{x,x,y,y,z,w\}$ on the right-hand side of these six equations. An edge $u \to v$ indicates that variable $v$ depends functionally on variable $u$ in the corresponding equation. The next section analyses this dependency structure using guessing number/entropy techniques.
  • Figure 2: Variable dependency graph $G_\Gamma$ for the normalised system derived from $f(x,f(x,y))=y$ and $f(f(x,y),y)=x$, where $z=f(x,y)$.
  • Figure 3: Variable dependency graph $G_\Gamma$: the core bidirected cycle $C_5$ on $\{x,y,z,\alpha,\beta\}$. To keep the figure visually simple, we omit the auxiliary outputs (e.g., $\gamma=f(x,y)$ and $\delta=f(y,x)$) and the disequality constraint $\gamma\neq\delta$.
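
The edge rule stated in the Figure 1 caption (an edge $u \to v$ when $v$ depends functionally on $u$) can be sketched in a few lines for the system of Figure 2. The normalised form used here, $z=f(x,y)$, $y=f(x,z)$, $x=f(z,y)$, is an assumption obtained by substituting $z=f(x,y)$ into the two equations of the caption; the paper's own Definition 3.1 may differ in details, and the helper name `dependency_edges` is hypothetical.

```python
def dependency_edges(equations):
    """Return the directed edges (u, v) of a variable dependency graph.

    Each equation is a pair (target, args) standing for target = f(args);
    every argument u contributes an edge u -> target.
    """
    edges = set()
    for target, args in equations:
        for arg in args:
            edges.add((arg, target))
    return edges


# Assumed normalised system for f(x, f(x, y)) = y and f(f(x, y), y) = x,
# with the auxiliary variable z = f(x, y).
normalised = [
    ("z", ("x", "y")),  # z = f(x, y)
    ("y", ("x", "z")),  # y = f(x, z)
    ("x", ("z", "y")),  # x = f(z, y)
]

edges = dependency_edges(normalised)

if __name__ == "__main__":
    for u, v in sorted(edges):
        print(f"{u} -> {v}")
```

Under this assumed normalisation every ordered pair of distinct variables among $x$, $y$, $z$ is an edge, so the graph is a bidirected triangle.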

Theorems & Definitions (47)

  • Remark 1.1: Normalisation choices
  • Definition 2.1: Normalised disequality
  • Definition 2.2: Normalised Equation
  • Proposition 2.3: Normalisation preserves solutions for pairs
  • proof
  • Remark 2.1: Connection to Diversification
  • Proposition 2.4
  • proof
  • Definition 3.1: Variable Dependency Graph
  • Definition 3.2: Interpretation
  • ...and 37 more