Robust Multi-agent Communication via Multi-view Message Certification
Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu
TL;DR
<p>This paper tackles robustness in cooperative MARL by addressing vulnerabilities in inter-agent communication. It introduces CroMAC, a framework that treats messages as multiple views of the state and fuses them with a multi-view variational autoencoder (MVAE) whose product-of-experts inference network forms a joint message representation. It then derives certificates between the joint representation and individual messages through interval bound propagation, bounding Q-values under worst-case perturbations, and trains the system with a robustness objective under the centralized training, decentralized execution paradigm. Empirical results on Hallway, Level-Based Foraging, Traffic Junction, and SMAC maps show that CroMAC matches, and often surpasses, existing baselines under various perturbation regimes, indicating robustness and generality across MARL settings. The work advances practical, certifiable robustness for multi-agent communication, with implications for deployment in noisy real-world environments.</p>
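To make the interval bound propagation step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small feed-forward Q-network with ReLU activations and an L-infinity perturbation of radius `eps` on its input representation. The function names, the network shape, and the rule of picking the action with the largest lower bound are all illustrative assumptions.

```python
def linear_bounds(W, b, lower, upper):
    """Propagate an axis-aligned box [lower, upper] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        # Centre of the output interval: the layer applied to the box centre.
        center = sum(w * (l + u) / 2.0 for w, l, u in zip(row, lower, upper)) + bias
        # Radius of the output interval: |W| applied to the box radius.
        radius = sum(abs(w) * (u - l) / 2.0 for w, l, u in zip(row, lower, upper))
        out_lo.append(center - radius)
        out_hi.append(center + radius)
    return out_lo, out_hi


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def certified_q_bounds(layers, z, eps):
    """Bound every Q-value for all inputs within eps of z (L-infinity ball).

    layers: list of (W, b) pairs; ReLU is applied between layers only.
    """
    lower = [v - eps for v in z]
    upper = [v + eps for v in z]
    for i, (W, b) in enumerate(layers):
        lower, upper = linear_bounds(W, b, lower, upper)
        if i < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


def robust_action(q_lower):
    """Assumed decision rule: pick the action with the best worst-case Q-value."""
    return max(range(len(q_lower)), key=lambda a: q_lower[a])


# Hypothetical single-layer Q-head over a 2-dimensional representation.
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])]
q_lo, q_hi = certified_q_bounds(layers, z=[1.0, 0.0], eps=0.1)
best = robust_action(q_lo)
```

The bounds produced this way are sound but can be loose for deep networks, which is one reason such certificates are typically tightened through a training objective rather than applied only at test time.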
Abstract
Many multi-agent scenarios require agents to share messages in order to coordinate, which makes the robustness of multi-agent communication a pressing concern when policies are deployed in environments where messages may be perturbed. Most related work tackles this issue under specific assumptions, for example that only a limited number of message channels are perturbed, which limits its efficiency in complex scenarios. In this paper, we take a further step by learning a robust multi-agent communication policy via multi-view message certification, dubbed CroMAC. Agents trained under CroMAC obtain guaranteed lower bounds on state-action values, which let them identify and choose the optimal action under worst-case deviation when the received messages are perturbed. Concretely, we first model multi-agent communication as a multi-view problem, where each message is a view of the state. We then extract a certified joint message representation with a multi-view variational autoencoder (MVAE) that uses a product-of-experts inference network. During optimization, we apply perturbations in the latent space of the state to obtain a certificate guarantee, and the learned joint message representation is used to approximate the certified state representation during training. Extensive experiments on several cooperative multi-agent benchmarks validate the effectiveness of CroMAC.
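The product-of-experts fusion mentioned above can be illustrated with a short sketch. It assumes each received message is encoded as a diagonal Gaussian over the latent state (mean and log-variance), and that a standard-normal prior may be included as one extra expert, as is common in MVAE formulations. The function name and the `include_prior` flag are hypothetical, and this is not the authors' code.

```python
import math


def product_of_experts(mus, logvars, include_prior=True):
    """Fuse diagonal-Gaussian experts into a single Gaussian.

    mus, logvars: one list per message, each of length latent_dim.
    Returns (joint_mu, joint_logvar), each of length latent_dim.
    """
    latent_dim = len(mus[0])
    experts = list(zip(mus, logvars))
    if include_prior:
        # Assumption: a standard-normal prior N(0, I) acts as one more expert.
        experts.append(([0.0] * latent_dim, [0.0] * latent_dim))

    joint_mu, joint_logvar = [], []
    for d in range(latent_dim):
        # Precision (inverse variance) of each expert in dimension d.
        precisions = [math.exp(-lv[d]) for _, lv in experts]
        total = sum(precisions)
        # Joint mean is the precision-weighted average of the expert means.
        weighted = sum(mu[d] * p for (mu, _), p in zip(experts, precisions))
        joint_mu.append(weighted / total)
        # Joint variance is the inverse of the summed precisions.
        joint_logvar.append(-math.log(total))
    return joint_mu, joint_logvar


# Two hypothetical messages, each a unit-variance view of a 1-D latent state.
mu, logvar = product_of_experts([[1.0], [3.0]], [[0.0], [0.0]], include_prior=False)
```

Because precisions add, a confident message (small variance) dominates the joint estimate, and the fusion remains well defined when some messages are missing.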
