Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems
Xiaoqing Wang, Keman Huang, Bin Liang, Hongyu Li, Xiaoyong Du
TL;DR
This work identifies two security risk scenarios in LLM-based multi-agent software development systems, Malicious User with Benign Agents (MU-BA) and Benign User with Malicious Agents (BU-MA), and introduces the Implicit Malicious Behavior Injection Attack (IMBIA) along with its defense, Adv-IMBIA. The authors formalize the attack and the defense and evaluate them on three representative frameworks (ChatDev, MetaGPT, AgentVerse). The evaluation shows high attack success rates (ASR) and vulnerability patterns that depend on both the framework and the development phase, with the coding and testing phases emerging as the most vulnerable. Adv-IMBIA significantly reduces ASR, particularly in MU-BA; BU-MA is harder to defend, which suggests that practical security requires architecture-aware, targeted defenses. The study provides actionable guidelines for resource-efficient defense deployment, highlights the need for security across multi-agent software engineering pipelines, and proposes concrete methods to detect and mitigate covert malicious behavior in collaborative software development.
Abstract
The rapid advancement of Large Language Model (LLM)-driven multi-agent systems has significantly streamlined software development tasks, enabling users with little technical expertise to develop executable applications. While these systems democratize software creation through natural language requirements, they introduce significant security risks that remain largely unexplored. We identify two risky scenarios: Malicious User with Benign Agents (MU-BA) and Benign User with Malicious Agents (BU-MA). We introduce the Implicit Malicious Behavior Injection Attack (IMBIA), demonstrating how multi-agent systems can be manipulated to generate software with concealed malicious capabilities beneath seemingly benign applications, and propose Adv-IMBIA as a defense mechanism. Evaluations across ChatDev, MetaGPT, and AgentVerse frameworks reveal varying vulnerability patterns, with IMBIA achieving attack success rates of 93%, 45%, and 71% in MU-BA scenarios, and 71%, 84%, and 45% in BU-MA scenarios. Our defense mechanism reduces attack success rates significantly, particularly in the MU-BA scenario. Further analysis reveals that compromised agents in the coding and testing phases pose significantly greater security risks, and it identifies critical agents that require protection against malicious user exploitation. Our findings highlight the urgent need for robust security measures in multi-agent software development systems and provide practical guidelines for implementing targeted, resource-efficient defensive strategies.
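To make the "varying vulnerability patterns" concrete, the reported figures can be tabulated and compared per scenario. The sketch below is illustrative only. It assumes the percentages are listed in the same order as the frameworks (ChatDev, MetaGPT, AgentVerse), which the abstract implies but does not state, and it assumes the usual definition of attack success rate as successful attacks divided by attempts. The names `REPORTED_ASR`, `attack_success_rate`, and `most_vulnerable` are hypothetical and do not come from the paper.

```python
# Attack success rates (percent) as quoted in the abstract.
# ASSUMPTION: values are mapped to frameworks in listing order.
REPORTED_ASR = {
    "MU-BA": {"ChatDev": 93, "MetaGPT": 45, "AgentVerse": 71},
    "BU-MA": {"ChatDev": 71, "MetaGPT": 84, "AgentVerse": 45},
}


def attack_success_rate(successes: int, attempts: int) -> float:
    """Return ASR as a percentage.

    ASSUMPTION: ASR is the share of attack attempts that succeed;
    the abstract does not spell out the exact metric.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100 * successes / attempts


def most_vulnerable(scenario: str) -> str:
    """Return the framework with the highest reported ASR in a scenario."""
    rates = REPORTED_ASR[scenario]
    return max(rates, key=rates.get)


if __name__ == "__main__":
    for scenario in REPORTED_ASR:
        print(scenario, "->", most_vulnerable(scenario))
```

Under the ordering assumption, the framework with the highest ASR differs between the two scenarios, which is consistent with the abstract's call for targeted rather than uniform defenses.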
