Local generation of languages
Mathieu Hoyrup
TL;DR
This work develops a topological framework for generating languages with local, one-shot rules. By encoding input–output dependencies as a communication complex $K_f$ over positions, the author relates language generation to simplicial topology, irreducibility, and symmetry properties, and identifies minimal complexes (graphs or higher-dimensional complexes) that suffice to generate various languages. The paper gives complete characterizations for several natural binary languages (e.g., even/odd parity, non-decreasing, non-constant, upwards-closed), and partial results and conjectures for more complex cases (e.g., unique or multiple occurrences of a symbol, identical consecutive letters). The connection to distributed computing via combinatorial topology clarifies how local rules can realize global specifications, and it highlights both the power and the limitations of graph-based versus higher-dimensional communication structures. The work also outlines directions for future research, including a deeper understanding of irreducible components, symmetry breaking, and the computational complexity of deciding whether a generator exists, with potential extensions to broader topological methods.
Abstract
Given a language, which in this article is a set of strings of some fixed length, we study the problem of producing its elements by a procedure in which each position has its own local rule. We introduce a way of measuring how much communication is needed between positions. The communication structure is captured by a simplicial complex whose vertices are the positions and whose simplices are the communication channels between positions. The main problem is then to identify the simplicial complexes that can be used to generate a given language. We develop the theory and apply it to a number of languages.
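To make the setting concrete, here is a small illustrative sketch, not taken from the article itself. It assumes one possible reading of the model: every communication channel carries an independent input bit, and each position computes its output letter from the channels it belongs to. Under that assumption, the even-parity language on $n$ bits can be generated by a path graph in which position $i$ shares one channel with position $i+1$. The function names and the path layout are hypothetical choices made for illustration.

```python
from itertools import product


def generate(n, channel_bits):
    """Apply one local rule per position (illustrative model, assumed here).

    Channel j (for 0 <= j < n - 1) is shared by positions j and j + 1.
    Position i outputs the XOR of the channels it can read.
    """
    word = []
    for i in range(n):
        left = channel_bits[i - 1] if i > 0 else 0      # channel shared with i - 1
        right = channel_bits[i] if i < n - 1 else 0     # channel shared with i + 1
        word.append(left ^ right)
    return tuple(word)


def generated_language(n):
    """All words obtained by ranging over every assignment of channel bits."""
    return {generate(n, bits) for bits in product((0, 1), repeat=n - 1)}


def even_parity_language(n):
    """Binary words of length n with an even number of 1s."""
    return {w for w in product((0, 1), repeat=n) if sum(w) % 2 == 0}


if __name__ == "__main__":
    n = 4
    # Each channel bit appears in exactly two outputs, so the parity is even,
    # and distinct channel assignments give distinct words.
    print(generated_language(n) == even_parity_language(n))
```

Each channel bit is counted by exactly two neighbouring positions, so every generated word has even parity; conversely, the channel bits can be recovered from the word, so all $2^{n-1}$ even-parity words are reached. In this toy model, edges alone (a one-dimensional complex) are enough for this language.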
