Visual Theory of Mind Enables the Invention of Proto-Writing

Benjamin A. Spiegel, Lucas Gelfond, George Konidaris

TL;DR

This work investigates how proto-writing could emerge by combining naturalistic, multi-agent reinforcement learning with inferential communication grounded in visual theory of mind. It introduces Signification Games to study the invention and evolution of pictographs, revealing a signification gap that limits purely behaviorist signaling. A two-part model shows that incorporating a Bayesian-like inferential mechanism and a referent-sensitivity term $P_{ref}$ enables communication of visually complex referents with crude pictographs and explains the observed shift from iconic to abstract signs over time. The results suggest that visual theory of mind is a crucial cognitive mechanism enabling the invention of the first writing-like signs and establish Signification Games as a general tool for exploring communication across media and species.

Abstract

Symbolic writing systems are graphical semiotic codes that are ubiquitous in modern society but are otherwise absent in the animal kingdom. Anthropological evidence suggests that the earliest forms of some writing systems originally consisted of iconic pictographs, which signify their referent via visual resemblance. While previous studies have examined the emergence and, separately, the evolution of pictographic systems through a computational lens, most employ non-naturalistic methodologies that make it difficult to draw clear analogies to human and animal cognition. We develop a multi-agent reinforcement learning testbed for emergent communication called a Signification Game, and formulate a model of inferential communication that enables agents to leverage visual theory of mind to communicate actions using pictographs. Our model, which is situated within a broader formalism for animal communication, sheds light on the cognitive and cultural processes underlying the emergence of proto-writing.
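As a rough illustration of what inferential communication could look like computationally, the sketch below applies plain Bayes' rule: a listener scores each candidate referent by how well the signal resembles it, weights that score by a per-referent term, and normalizes. This is a generic sketch, not the paper's model. The function name `infer_referent`, the resemblance scores, and the treatment of $P_{ref}$ as a prior-like weight over referents are all assumptions made for illustration.

```python
import numpy as np

def infer_referent(similarity, p_ref):
    """Posterior over referents via Bayes' rule: P(r | s) is proportional to P(s | r) * P_ref(r).

    similarity: hypothetical likelihood-like scores of the signal under each referent.
    p_ref: assumed prior-like weight for each referent (our reading of P_ref).
    """
    similarity = np.asarray(similarity, dtype=float)
    p_ref = np.asarray(p_ref, dtype=float)
    unnormalized = similarity * p_ref
    return unnormalized / unnormalized.sum()

# Hypothetical numbers: a crude drawing that weakly resembles referents 0 and 1.
similarity = np.array([0.30, 0.25, 0.05])
p_ref = np.array([0.2, 0.6, 0.2])

posterior = infer_referent(similarity, p_ref)
best_guess = int(np.argmax(posterior))
```

With these made-up values, the per-referent weight shifts the listener's best guess from the referent with the highest resemblance score to the one with the highest weighted score, which is the kind of effect a referent-sensitivity term could have on an ambiguous pictograph.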

Paper Structure

This paper contains 18 sections, 1 equation, 4 figures.

Figures (4)

  • Figure 1: A) Agents in a Signification Game signaling for manipulative (red agent) or cooperative (green agent) ends. Actions yielding positive and negative rewards are denoted with green and red bubbles, respectively. Agents perceive signals and environment observations with the same networks. B) Signs invented by agents to mimic classes from the Cifar100 dataset. Agents create pictographs given discrete states that serve as class labels. C) We simulate communication over long time horizons, finding that initially iconic signs grow more abstract over time. We compare the evolution of a sign for Cifar100 palm trees with that of a Chinese character for "tree."
  • Figure 2: Visualization of Signaling Over Time. Agent signals first appear random (top rows) and grow increasingly ordered over time. In a manipulation setting (left), senders learn to generate convincing replicas of state observations, tricking increasingly picky listeners. In a manipulation-cooperation setting (second from left), agents begin cooperating after 300 epochs of manipulation, resulting in a drift from iconic signals into abstract symbols. In a fully cooperative setting (second from right), agents generate vaguely iconic signals that quickly shift into symbols. Environmental pressures on signal production and interpretation, simulated by size and curve penalties, greatly affect the appearance of signals over time, as evidenced by the final rows of signs. Qualitative analysis of the iconic nature of signals is supported by an iconicity probe (right, average of 10 runs plotted with a 95% confidence interval). Signals reach low probe entropy (i.e., higher confidence of recognition) shortly after communication begins; probe entropy then increases dramatically, reflecting a transition from icons to abstract symbols.
  • Figure 3: Senders have difficulty generating convincing environment observations for most referents. The distance between the space of drawings and the decision boundary for a referent represents a signification gap for that referent.
  • Figure 4: Comparison of Behaviorist and Inferential Signaling. Agents wielding pictographic signification quickly succeed at communicating by generating iconic stimuli, achieving high rates of communicative success after only 300 epochs (second from right). As in Part I, we find that environmental pressures affect the appearance of pictographs over time (second group from left). We also report one agent's estimate of $P_{ref}(r_i)$ for five referents over time (right). The $P_{ref}$ values for a small number of referents (e.g., dolphins and rockets) are volatile before drifting back toward the mean. We hypothesize this is because agents master some referents before others, likely owing to inherent difficulties in creating iconic signs for some referents. Analysis of Cifar100 finds that there is greater visual variance in the classes for which $P_{ref}$ is more volatile (e.g., rockets are sometimes obscured by plumes of exhaust smoke).
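
The probe entropy mentioned in the Figure 2 caption can be illustrated with a short sketch. It assumes the iconicity probe is a classifier that outputs logits over referent classes and that probe entropy is the Shannon entropy of its softmax distribution; the function name `probe_entropy` and the example logits are hypothetical, and the probe's actual architecture is not specified here.

```python
import numpy as np

def probe_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over the last axis."""
    logits = np.asarray(logits, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    probs = np.exp(log_probs)
    return -(probs * log_probs).sum(axis=-1)

# Hypothetical probe outputs over four classes.
iconic_logits = np.array([8.0, 0.0, 0.0, 0.0])  # probe confidently recognizes one class
abstract_logits = np.zeros(4)                   # probe is maximally uncertain

iconic_h = probe_entropy(iconic_logits)
abstract_h = probe_entropy(abstract_logits)
```

Under this reading, a recognizable (iconic) signal gives a peaked distribution and low entropy, while an abstract symbol that the probe cannot place gives a flatter distribution and entropy closer to the maximum of log(number of classes).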