Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input
Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, Stephen Clark
TL;DR
The paper shows that cooperative reinforcement-learning agents can develop referential communication protocols from both symbolic, disentangled inputs and raw pixel data. Structured, attribute-based representations yield more robust, compositional languages that generalize to novel objects and show topographic alignment between meanings and signals. Agents trained on raw pixel input still communicate above chance, but their languages are more ad hoc and harder to interpret, and compositional structure appears only to the extent that the underlying factors of variation are disentangled. Overall, the work extends emergent-communication research to more realistic perceptual input and shows that environmental structure plays a critical role in shaping language emergence and grounding.
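To make "topographic alignment between meanings and signals" concrete, the sketch below computes one common form of topographic similarity: the Spearman rank correlation between pairwise distances in meaning space and pairwise distances in message space. It is illustrative only. The function names, the toy data, and the choice of Hamming distance for attribute vectors and edit distance for messages are assumptions made for this example, not details taken from the summary above.

```python
from itertools import combinations


def hamming(a, b):
    """Number of attribute positions in which two meanings differ."""
    return sum(x != y for x, y in zip(a, b))


def edit_distance(s, t):
    """Levenshtein distance between two messages (sequences of symbols)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined here; 0.0 is a convention
    return cov / (vx * vy) ** 0.5


def topographic_similarity(meanings, messages):
    """Spearman correlation between meaning and message pairwise distances."""
    pairs = list(combinations(range(len(meanings)), 2))
    meaning_d = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    message_d = [edit_distance(messages[i], messages[j]) for i, j in pairs]
    return pearson(average_ranks(meaning_d), average_ranks(message_d))


# Toy data (hypothetical): four objects described by two binary attributes.
meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
# One symbol per attribute value, so similar objects get similar messages.
compositional = topographic_similarity(meanings, ["aa", "ab", "ba", "bb"])
# The same four messages assigned without regard to the attributes.
holistic = topographic_similarity(meanings, ["ab", "ba", "aa", "bb"])
```

On this toy data the compositional mapping scores higher than the holistic one, which is the sense in which a structured protocol aligns meanings with signals.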
Abstract
The ability of algorithms to evolve or learn (compositional) communication protocols has traditionally been studied in the language evolution literature through the use of emergent communication tasks. Here we scale up this research by using contemporary deep learning methods and by training reinforcement-learning neural network agents on referential communication games. We extend previous work, in which agents were trained in symbolic environments, by developing agents which are able to learn from raw pixel data, a more challenging and realistic input representation. We find that the degree of structure found in the input data affects the nature of the emerged protocols, and thereby corroborate the hypothesis that structured compositional language is most likely to emerge when agents perceive the world as being structured.
