On the Optimality of Discrete Object Naming: a Kinship Case Study
Phong Le, Mees Lindeman, Raquel G. Alhama
TL;DR
This work addresses how discrete object naming can achieve an optimal trade-off between informativeness and complexity. It introduces an information-theoretic framework with information loss $L$ and complexity $C$, proves that the optimal trade-off is achievable if and only if the Listener's decoder is equivalent to the Speaker's Bayesian decoder, and applies this result to kinship naming in both human data and emergent neural speakers. Through a need-agnostic reformulation, it enables cross-linguistic comparison independent of communicative need distributions, and demonstrates that emergent neural systems can approach the theoretical frontier while maintaining tractable complexity. The findings highlight a robust link between Bayesian decoding and efficiency in discrete naming, with practical implications for modeling language evolution and designing efficient emergent communication systems.
Abstract
The structure of naming systems in natural languages hinges on a trade-off between high informativeness and low complexity. Prior work capitalizes on information theory to formalize these notions; however, these studies generally rely on two simplifications: (i) optimal listeners, and (ii) universal communicative need across languages. Here, we address these limitations by introducing an information-theoretic framework for discrete object naming systems, and we use it to prove that an optimal trade-off is achievable if and only if the listener's decoder is equivalent to the Bayesian decoder of the speaker. Adopting a referential game setup from emergent communication, and focusing on the semantic domain of kinship, we show that our notion of optimality is not only theoretically achievable but also emerges empirically in learned communication systems.
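To make the role of the Bayesian decoder concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that information loss is measured as the expected surprisal of the intended object under the listener, which may differ from the exact definition of $L$ used in the paper. The three-object, two-word speaker and the prior are hypothetical toy values chosen only for illustration. Given a prior $p(o)$ and a speaker $p(w \mid o)$, the sketch builds the decoder $p(o \mid w) \propto p(w \mid o)\,p(o)$ and compares its loss with that of a listener that guesses uniformly.

```python
import math

def bayesian_decoder(prior, speaker):
    """Invert a speaker p(w|o) into p(o|w) with Bayes' rule.

    prior[o] is p(o); speaker[o][w] is p(w|o).
    Returns decoder[w][o] = p(o|w). Assumes every word has nonzero probability.
    """
    n_objects, n_words = len(speaker), len(speaker[0])
    decoder = []
    for w in range(n_words):
        joint = [prior[o] * speaker[o][w] for o in range(n_objects)]  # p(o, w)
        total = sum(joint)                                            # p(w)
        decoder.append([j / total for j in joint])
    return decoder

def expected_surprisal(prior, speaker, listener):
    """Expected -log2 listener(o|w) under the joint p(o) p(w|o).

    This is an assumed stand-in for information loss, not the paper's exact L.
    """
    loss = 0.0
    for o, p_o in enumerate(prior):
        for w, p_w_given_o in enumerate(speaker[o]):
            p = p_o * p_w_given_o
            if p > 0:
                loss -= p * math.log2(listener[w][o])
    return loss

# Hypothetical toy system: 3 objects, 2 words.
prior = [0.5, 0.3, 0.2]
speaker = [[0.9, 0.1],
           [0.2, 0.8],
           [0.1, 0.9]]

bayes = bayesian_decoder(prior, speaker)
uniform = [[1.0 / 3] * 3 for _ in range(2)]  # listener that ignores the word

loss_bayes = expected_surprisal(prior, speaker, bayes)
loss_uniform = expected_surprisal(prior, speaker, uniform)
print(f"Bayesian listener loss: {loss_bayes:.3f} bits")
print(f"Uniform listener loss:  {loss_uniform:.3f} bits")
```

Under this assumed loss, the Bayesian listener attains the conditional entropy of the object given the word, and by Gibbs' inequality any other listener incurs at least as much loss for the same speaker. This is the intuition behind tying optimality to the speaker's Bayesian decoder.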
