Learning Efficient Recursive Numeral Systems via Reinforcement Learning
Andrea Silvi, Jonathan Thomas, Emil Carlsson, Devdatt Dubhashi, Moa Johansson
TL;DR
The paper addresses how recursive numeral systems can emerge under pressures for efficient communication. It introduces a neuro-symbolic two-agent RL framework built on a slightly modified version of Hurford's (1975) meta-grammar, which the agents can gradually modify during their interactions. The main results show that RL-guided interaction yields numeral systems that lie near the Pareto frontier of lexicon size and morphosyntactic complexity, with configurations that resemble human systems. This work is a step towards a mechanistic explanation of the emergence of efficient recursive numeral systems, and it points to future work on iterated learning and distributional effects to align the learned systems more closely with human languages.
Abstract
It has previously been shown that, using reinforcement learning (RL), agents can derive simple approximate and exact-restricted numeral systems that are similar to human ones (Carlsson, 2021). However, it is a major challenge to show how more complex recursive numeral systems, such as that of English, could arise via a simple learning mechanism such as RL. Here, we introduce an approach towards deriving a mechanistic explanation of the emergence of efficient recursive numeral systems. We consider pairs of agents learning how to communicate about numerical quantities through a meta-grammar that can be gradually modified throughout the interactions. Utilising a slightly modified version of the meta-grammar of Hurford (1975), we demonstrate that our RL agents, shaped by the pressures for efficient communication, can effectively modify their lexicon towards Pareto-optimal configurations that are comparable in efficiency to those observed in human numeral systems.
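
To make the notion of Pareto-optimality concrete, the following is a minimal sketch of how a Pareto front can be computed when each candidate numeral system is summarised by two costs to be minimised: lexicon size and average morphosyntactic complexity. The function names (`dominates`, `pareto_front`) and the numeric values are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's actual complexity measures are not reproduced here.

```python
from typing import List, Tuple

# Each candidate numeral system is summarised by two costs, both minimised:
# (lexicon_size, avg_morphosyntactic_complexity).
# All values below are illustrative assumptions, not results from the paper.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both costs and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by lexicon size."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


# Hypothetical candidate systems: (lexicon size, average complexity).
candidates = [(10, 3.0), (12, 2.5), (20, 1.5), (15, 2.8), (30, 1.5), (12, 3.2)]
front = pareto_front(candidates)
print(front)
```

In this toy setting, a system is "near the Pareto frontier" when no other candidate has both a smaller lexicon and lower complexity, which is the sense of efficiency used in the abstract.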
