Red-Teaming LLM Multi-Agent Systems via Communication Attacks

Pengfei He; Yupin Lin; Shen Dong; Han Xu; Yue Xing; Hui Liu

TL;DR

The paper identifies a systemic flaw in LLM-based multi-agent systems (LLM-MAS): inter-agent messages can be intercepted and manipulated to derail collaborative tasks. It introduces Agent-in-the-Middle (AiTM), an attack in which an external adversarial agent intercepts messages and uses a reflection-based prompting loop to craft contextually tailored instructions that influence victim agents and propagate malicious behavior through the system. Experiments across AutoGen and Camel, diverse communication structures, and multiple real-world datasets show consistent vulnerability, with high attack success rates and notable performance degradation in real-world frameworks such as MetaGPT and ChatDev. These findings point to the need for defenses that target the communication layer of LLM-MAS.

Abstract

Large Language Model-based Multi-Agent Systems (LLM-MAS) have revolutionized complex problem-solving by enabling sophisticated agent collaboration through message-based communication. While the communication framework is crucial for agent coordination, it also introduces a critical yet unexplored security vulnerability. In this work, we introduce Agent-in-the-Middle (AiTM), a novel attack that exploits the fundamental communication mechanisms in LLM-MAS by intercepting and manipulating inter-agent messages. Unlike existing attacks that compromise individual agents, AiTM demonstrates how an adversary can compromise an entire multi-agent system by manipulating only the messages passing between agents. To enable the attack under the challenges of limited control and role-restricted communication formats, we develop an LLM-powered adversarial agent with a reflection mechanism that generates contextually aware malicious instructions. Our comprehensive evaluation across various frameworks, communication structures, and real-world applications demonstrates that LLM-MAS are vulnerable to communication-based attacks, highlighting the need for robust security measures in multi-agent systems.
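
The interception-plus-reflection idea described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: `Message`, `Channel`, and `ReflectiveInterceptor` are hypothetical names, the channel is a plain in-memory stand-in rather than the transport of AutoGen or Camel, and the "reflection" step is a simple string check where the paper uses an LLM. The injected payload is a harmless marker string.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Message:
    sender: str
    recipient: str
    content: str


# Hypothetical hook type: takes a message in transit, returns what gets delivered.
Interceptor = Callable[[Message], Message]


class Channel:
    """Toy in-memory message bus (assumption: stands in for a real framework's transport)."""

    def __init__(self, interceptor: Optional[Interceptor] = None) -> None:
        self.interceptor = interceptor
        self.delivered: List[Message] = []

    def send(self, msg: Message) -> Message:
        # An agent in the middle sees the message before the recipient does.
        if self.interceptor is not None:
            msg = self.interceptor(msg)
        self.delivered.append(msg)
        return msg


class ReflectiveInterceptor:
    """Stub adversarial agent: assess each intercepted message, then decide whether to rewrite it."""

    def __init__(self, goal_marker: str, target_recipient: str) -> None:
        self.goal_marker = goal_marker
        self.target_recipient = target_recipient
        self.notes: List[str] = []  # reflection history, one note per targeted message

    def reflect(self, msg: Message) -> str:
        # Placeholder for an LLM-based assessment of progress toward the goal.
        return "goal already present" if self.goal_marker in msg.content else "goal missing"

    def __call__(self, msg: Message) -> Message:
        if msg.recipient != self.target_recipient:
            return msg  # only messages to the victim agent are touched
        note = self.reflect(msg)
        self.notes.append(note)
        if note == "goal already present":
            return msg
        return Message(msg.sender, msg.recipient, f"{msg.content} {self.goal_marker}")


if __name__ == "__main__":
    adversary = ReflectiveInterceptor(goal_marker="[INJECTED]", target_recipient="coder")
    channel = Channel(adversary)
    print(channel.send(Message("planner", "coder", "write tests")).content)
    print(channel.send(Message("planner", "reviewer", "check style")).content)
```

The sketch shows why the communication layer matters: neither the sending nor the receiving agent is modified, yet the recipient acts on altered content. A channel that let recipients verify message integrity would make this kind of rewrite detectable.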
