Learning Approximate and Exact Numeral Systems via Reinforcement Learning

Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

TL;DR

The paper addresses how efficient numeral systems can emerge through learning by having two agents play a Lewis signaling game, trained with reinforcement learning, in which the goal is to convey numeral concepts. The approach yields exact and approximate numeral systems that are near-optimal under information-theoretic costs and resemble human systems of comparable complexity, providing a mechanistic, learning-based explanation for prior findings. The results demonstrate that non-recursive numeral systems can be learned and aligned with Gaussian Weber-model representations, suggesting broad applicability to other semantic domains. The work offers a principled framework linking reward structures to communicative efficiency and points to future directions such as larger ranges, approximate arithmetic, recursion, and pragmatic reasoning enhancements.

Abstract

Recent work (Xu et al., 2020) has suggested that numeral systems in different languages are shaped by a functional need for efficient communication in an information-theoretic sense. Here we take a learning-theoretic approach and show how efficient communication emerges via reinforcement learning. In our framework, two artificial agents play a Lewis signaling game where the goal is to convey a numeral concept. The agents gradually learn to communicate using reinforcement learning, and the resulting numeral systems are shown to be efficient in the information-theoretic framework of Regier et al. (2015) and Gibson et al. (2017). They are also shown to be similar to human numeral systems of the same type. Our results thus provide a mechanistic explanation via reinforcement learning of the recent results in Xu et al. (2020) and can potentially be generalized to other semantic domains.

Paper Structure

This paper contains 9 sections, 10 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Illustration of the communication setup presented in Xu et al. (2020). The sender wants to convey the numeral concept $4$ and utters "a few". The listener is unsure of which numeral the sender is referring to and produces a probability distribution over possible numerals.
  • Figure 2: Illustration of one round of our Lewis signaling game, which will be formally introduced in later sections. The sender is given a number $n$, samples a model $f_S$ from $F_S$ using dropout, and conveys the word $w$ that gives the highest reward according to $f_S$. The listener proceeds in a similar fashion: given $w$, it samples a model $f_L$ from $F_L$ and guesses the number $\hat{n}$ that yields the most reward according to $f_L$. A shared reward is given to both agents based on how close $\hat{n}$ is to $n$.
  • Figure 3: Term usage vs. communication cost. Note that our agents are not restricted to modeling the words as Gaussian distributions and can create other probability distributions. This explains why the line goes below the convex hull for $2$ terms, since the hull was computed assuming Gaussian distributions. We plot the numeral systems from the human languages presented in Table \ref{tab:numeral_systems}; since many of them are very similar, we only get a few distinct points for human languages in the figure.
  • Figure 4: Comparison between the optimal numeral systems with respect to communication cost, human systems, and the artificial consensus systems produced by our agents under the different reward functions. We considered the experiments using the power-law prior, and the optimal systems are computed under this prior. Each color represents a numeral word and the corresponding interval on the number line that the word represents.
  • Figure 5: a) The need probabilities, or priors, used. b) Relative frequency of term uses over sender-listener pairs using the linear reward function and varying the need probability. The more left-skewed the need probability is, the fewer terms are generally used by the agents.
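
The round described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the agents here are plain value tables rather than neural models, epsilon-greedy exploration stands in for the dropout-based sampling of $f_S$ and $f_L$, and the exact form of `linear_reward`, the exponent of the power-law prior, and all names (`play_round`, `train`, `sender_q`, `listener_q`) are hypothetical choices made for the example.

```python
import random


def linear_reward(n, n_hat, n_numbers):
    """Assumed reward shape: 1 for an exact guess, falling linearly to 0."""
    return 1.0 - abs(n - n_hat) / (n_numbers - 1)


def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=values.__getitem__)


def play_round(sender_q, listener_q, prior, rng, eps=0.1):
    """One sender -> word -> listener exchange with a shared reward.

    Index k stands for the numeral k + 1. Epsilon-greedy exploration is
    an assumption that replaces the dropout sampling in the caption.
    """
    n_numbers, n_words = len(sender_q), len(listener_q)
    # Draw the target number from the need probability (prior).
    n = rng.choices(range(n_numbers), weights=prior)[0]
    # Sender picks the word with the highest estimated reward.
    w = rng.randrange(n_words) if rng.random() < eps else argmax(sender_q[n])
    # Listener guesses the number with the highest estimated reward.
    if rng.random() < eps:
        n_hat = rng.randrange(n_numbers)
    else:
        n_hat = argmax(listener_q[w])
    return n, w, n_hat, linear_reward(n, n_hat, n_numbers)


def train(n_numbers=20, n_words=5, rounds=20000, lr=0.1, seed=0):
    """Train both tables with a simple running-average update (assumed)."""
    rng = random.Random(seed)
    # Power-law need probability; the exponent 2 is an assumption.
    weights = [1.0 / (k * k) for k in range(1, n_numbers + 1)]
    total = sum(weights)
    prior = [x / total for x in weights]
    sender_q = [[rng.gauss(0.0, 0.01) for _ in range(n_words)]
                for _ in range(n_numbers)]
    listener_q = [[rng.gauss(0.0, 0.01) for _ in range(n_numbers)]
                  for _ in range(n_words)]
    for _ in range(rounds):
        n, w, n_hat, r = play_round(sender_q, listener_q, prior, rng)
        # Both agents receive the same reward and update the chosen action.
        sender_q[n][w] += lr * (r - sender_q[n][w])
        listener_q[w][n_hat] += lr * (r - listener_q[w][n_hat])
    return sender_q, listener_q, prior
```

After training, `[argmax(row) for row in sender_q]` gives the word the sender prefers for each numeral, which is one way to read off the kind of interval partition that Figure 4 visualizes.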