NetSafe: Exploring the Topological Safety of Multi-agent Networks
Miao Yu, Shilong Wang, Guibin Zhang, Junyuan Mao, Chenlong Yin, Qijiong Liu, Qingsong Wen, Kun Wang, Yang Wang
TL;DR
The paper addresses how to safeguard LLM-based multi-agent networks by examining how network topology affects vulnerability to misinformation, bias, and harmful information. It introduces NetSafe, a framework that uses RelCom to standardize iterative inter-agent communication and so enable rigorous topological safety analysis. Through static and dynamic evaluations across multiple topologies and attack types, it shows that highly connected graphs are more susceptible to misinformation, while networks show stronger safety against bias and harmful information under certain conditions. Among the static metrics, APV aligns more closely with real-world dynamic evaluations than traditional graph-theoretic metrics. The paper also reports universal phenomena such as Agent Hallucination and Aggregation Safety, providing actionable insights for designing safer multi-agent systems.
Abstract
Large language models (LLMs) have empowered nodes within multi-agent networks with intelligence, showing growing applications in both academia and industry. However, how to prevent these networks from generating malicious information remains unexplored, and previous research on the safety of a single LLM is challenging to transfer. In this paper, we focus on the safety of multi-agent networks from a topological perspective, investigating which topological properties contribute to safer networks. To this end, we propose a general framework, NetSafe, along with an iterative RelCom interaction, to unify existing diverse LLM-based agent frameworks, laying the foundation for generalized topological safety research. We identify several critical phenomena that arise when multi-agent networks are exposed to attacks involving misinformation, bias, and harmful information, termed Agent Hallucination and Aggregation Safety. Furthermore, we find that highly connected networks are more susceptible to the spread of adversarial attacks, with task performance in a Star Graph Topology decreasing by 29.7%. In addition, our proposed static metrics align more closely with real-world dynamic evaluations than traditional graph-theoretic metrics, indicating that networks with greater average distances from attackers exhibit enhanced safety. In conclusion, our work introduces a new topological perspective on the safety of LLM-based multi-agent networks and discovers several unreported phenomena, paving the way for future research to explore the safety of such networks.
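To make the notion of "average distance from attackers" concrete, the following is a minimal sketch that compares three small topologies by the mean shortest-path distance from a single attacker node to every other agent. It is an illustration only and is not the paper's metric definition: the helper names (`chain`, `star`, `complete`, `avg_distance_from`), the choice of five agents, and the assumption of an undirected, connected graph are all hypothetical.

```python
from collections import deque


def chain(n):
    # Hypothetical helper: agents 0..n-1 connected in a line.
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def star(n):
    # Hypothetical helper: agent 0 is the hub, all others are leaves.
    adj = {0: list(range(1, n))}
    for i in range(1, n):
        adj[i] = [0]
    return adj


def complete(n):
    # Hypothetical helper: every agent is connected to every other agent.
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def avg_distance_from(adj, attacker):
    # Breadth-first search gives shortest hop counts in an unweighted graph.
    # Assumes the graph is connected and has at least two nodes.
    dist = {attacker: 0}
    queue = deque([attacker])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for node, d in dist.items() if node != attacker]
    return sum(others) / len(others)


if __name__ == "__main__":
    n = 5
    print("chain, attacker at end:", avg_distance_from(chain(n), 0))
    print("star, attacker at hub:", avg_distance_from(star(n), 0))
    print("star, attacker at leaf:", avg_distance_from(star(n), 1))
    print("complete graph:", avg_distance_from(complete(n), 0))
```

Under these assumptions, the chain places the other agents farthest from the attacker on average, while the complete graph and a hub attacker in the star place every agent one hop away. This is consistent in spirit with the abstract's observation that highly connected networks are more exposed, though the sketch measures only graph distance and says nothing about actual agent behavior.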
