Implicit Repair with Reinforcement Learning in Emergent Communication
Fábio Vital, Alberto Sardinha, Francisco S. Melo
TL;DR
This work addresses robust emergent communication under environmental noise by introducing a Noisy Lewis Game (NLG) that injects channel and input noise into a referential signaling task. Both Speaker and Listener are trained as reinforcement-learning agents, with the Listener able to operate without direct feedback to the Speaker, leading to an implicit repair dynamic where redundancy in messages preserves task success. Empirical results show that RL-based listeners outperform supervised baselines and that NLG-trained protocols exhibit strong robustness to noise and generalize across larger candidate sets and unseen perturbations, outperforming deterministic-channel variants especially under noise. These findings highlight redundancy as a practical mechanism for implicit repair in emergent communication, with implications for building robust multi-agent systems and informing future work on combining implicit and explicit repair strategies. The work also opens avenues toward universal repair-language systems that can adapt to new tasks and unseen environments.
Abstract
Conversational repair is a mechanism used to detect and resolve miscommunication and misinformation problems when two or more agents interact. One particular and underexplored form of repair in emergent communication is the implicit repair mechanism, where the interlocutor purposely conveys the desired information in such a way as to prevent misinformation from any other interlocutor. This work explores how redundancy can modify the emergent communication protocol so that it continues to convey the information needed to complete the underlying task, even under additional external environmental pressures such as noise. We focus on extending the signaling game known as the Lewis Game by adding noise to the communication channel and to the inputs received by the agents. Our analysis shows that agents add redundancy to the transmitted messages to prevent noise from degrading task success. We also observe that the generalization capabilities of the emerging communication protocol remain equivalent to those of architectures employed in simpler, entirely deterministic games. Finally, our method is the only one suitable for producing robust communication protocols that handle cases both with and without noise while maintaining increased generalization performance.
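To make the role of redundancy concrete, the following is a minimal sketch, not the method used in this work. It assumes a hypothetical channel that independently replaces each message token with an "unknown" token with a fixed probability, a hand-written Speaker that simply repeats one symbol, and a Listener that reads the first surviving token. The names `UNK`, `noisy_channel`, `repeat_encode`, `decode`, and `success_rate` are illustrative assumptions; the agents studied here learn their protocol through reinforcement learning rather than following fixed rules.

```python
import random

UNK = -1  # hypothetical "unknown" token marking a corrupted symbol (assumption)


def noisy_channel(message, noise_prob, rng):
    """Independently replace each token with UNK with probability noise_prob."""
    return [UNK if rng.random() < noise_prob else tok for tok in message]


def repeat_encode(symbol, length):
    """Hand-written redundant Speaker: repeat the same symbol `length` times."""
    return [symbol] * length


def decode(message):
    """Hand-written Listener: return the first uncorrupted token, else None."""
    for tok in message:
        if tok != UNK:
            return tok
    return None


def success_rate(length, noise_prob, trials=10_000, seed=0):
    """Estimate how often the Listener recovers the Speaker's symbol."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        symbol = rng.randrange(10)  # one of 10 hypothetical candidate symbols
        received = noisy_channel(repeat_encode(symbol, length), noise_prob, rng)
        hits += decode(received) == symbol
    return hits / trials


if __name__ == "__main__":
    for length in (1, 3, 5):
        print(length, success_rate(length, noise_prob=0.5))
```

Under these assumptions, a message of length n survives the channel with probability 1 - p^n, where p is the per-token noise probability, so longer redundant messages are recovered more often. This is the intuition behind implicit repair through redundancy, reduced to a toy setting.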
