Unsupervised Translation of Emergent Communication
Ido Levy, Orr Paradise, Boaz Carmeli, Ron Meir, Shafi Goldwasser, Yonatan Belinkov
TL;DR
This paper tackles the problem of translating emergent communication (EC), developed by AI agents playing referential games, into natural language without parallel data. It proposes a two-phase approach: first extract an EC corpus from agent interactions, then apply unsupervised neural machine translation (UNMT), using back-translation and denoising, with an English linguistic prior drawn from image captions. The study systematically analyzes how task complexity, defined by the semantic diversity of the environment (Random, Category, Supercategory, Inter-category), affects EC translatability, and it introduces a benchmark of EC-specific translation metrics. The findings show that UNMT can translate EC into English, that translatability varies with task complexity, and that the relationship between compositionality, entropy, and translation quality is nuanced. Together these results offer a path toward interpretable emergent languages.
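To make the denoising component concrete, here is a minimal sketch of the kind of noise function a denoising objective typically relies on: the model is trained to reconstruct a clean message from a corrupted copy. The summary above does not specify the paper's noise model, so this sketch assumes the token-dropout and local-shuffle noise commonly used in UNMT. The function name `add_noise`, its parameters `drop_prob` and `shuffle_window`, and the representation of an EC message as a list of integer symbol IDs are all hypothetical choices made for illustration.

```python
import random
from typing import List, Optional, Sequence


def add_noise(
    tokens: Sequence[int],
    drop_prob: float = 0.1,
    shuffle_window: int = 3,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Corrupt a message for a denoising objective (illustrative only).

    Assumed noise model, not taken from the paper:
      1. drop each token independently with probability `drop_prob`;
      2. shuffle the survivors locally, so that each one moves fewer
         than `shuffle_window` positions from where it started.
    """
    if rng is None:
        rng = random.Random()

    # Step 1: token dropout. Keep one token if everything was dropped,
    # so a non-empty message never becomes empty.
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept and tokens:
        kept = [tokens[rng.randrange(len(tokens))]]

    # Step 2: local shuffle. Add a random offset in [0, shuffle_window)
    # to each position and sort by the perturbed positions.
    keys = [i + rng.uniform(0, shuffle_window) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=keys.__getitem__)
    return [kept[i] for i in order]


if __name__ == "__main__":
    # Hypothetical EC message: a short sequence of symbol IDs.
    message = [17, 4, 4, 29, 8, 3]
    print(add_noise(message, rng=random.Random(0)))
```

During training, the corrupted message would be the encoder input and the original message the reconstruction target. The same function could be applied to tokenized English captions, which is one reason to keep it agnostic to what the tokens mean.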
Abstract
Emergent Communication (EC) provides a unique window into the language systems that emerge autonomously when agents are trained to jointly achieve shared goals. However, it is difficult to interpret EC and to evaluate its relationship with natural languages (NL). This study employs unsupervised neural machine translation (UNMT) techniques to decipher ECs formed during referential games of varying task complexity, where complexity is influenced by the semantic diversity of the environment. Our findings demonstrate UNMT's potential to translate EC. They show that task complexity characterized by semantic diversity enhances EC translatability, whereas higher task complexity with constrained semantic variability yields a pragmatic EC that, although challenging to interpret, remains suitable for translation. To our knowledge, this research marks the first attempt to translate EC without the aid of parallel data.
