CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models

Zhenhong Zhou, Zherui Li, Jie Zhang, Yuanhe Zhang, Kun Wang, Yang Liu, Qing Guo

TL;DR

The paper identifies blocking attacks as a looming threat to LLM-based multi-agent systems (LLM-MASs) and introduces Corba, a contagious recursive blocking attack that both sustains blocking within individual agents and propagates it across the network topology. It formalizes Blocking Attacks (B) and Contagious Attacks (C), combines them into Corba, and empirically shows that Corba outperforms baseline prompt-injection attacks on AutoGen, Camel, and open-ended MASs across diverse topologies and LLM backbones. The results show that Corba significantly reduces system availability (high P-ASR) and does so quickly (low PTN), in both open-source frameworks and complex topologies. The work highlights critical security vulnerabilities in current LLM-MAS deployments and motivates the development of robust defenses to ensure reliability in real-world use.

Abstract

Large Language Model-based Multi-Agent Systems (LLM-MASs) have demonstrated remarkable real-world capabilities, effectively collaborating to complete complex tasks. While these systems are designed with safety mechanisms, such as rejecting harmful instructions through alignment, their security remains largely unexplored. This gap leaves LLM-MASs vulnerable to targeted disruptions. In this paper, we introduce Contagious Recursive Blocking Attacks (Corba), a novel, simple, yet highly effective attack that disrupts interactions between agents within an LLM-MAS. Corba leverages two key properties: its contagious nature allows it to propagate across arbitrary network topologies, while its recursive property enables sustained depletion of computational resources. Notably, these blocking attacks often involve seemingly benign instructions, making them particularly challenging to mitigate using conventional alignment methods. We evaluate Corba on two widely used LLM-MASs, namely AutoGen and Camel, across various topologies and commercial models. Additionally, we conduct more extensive experiments in open-ended interactive LLM-MASs, demonstrating the effectiveness of Corba in complex topology structures and open-source models. Our code is available at: https://github.com/zhrli324/Corba.
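
To make the "contagious" property concrete, the following is a toy sketch of how a blocking prompt could spread over an agent topology. It is not the authors' implementation: the function name `simulate_spread`, the adjacency-dictionary representation, and the assumption that every forwarded message always blocks its recipient are illustrative simplifications (real agents would be affected only some of the time). The fraction it tracks is a stand-in for lost availability, not the paper's P-ASR definition.

```python
def simulate_spread(adjacency, entry_agent, max_turns):
    """Toy model: return the fraction of blocked agents after each turn.

    adjacency   -- dict mapping each agent id to the list of its neighbours
    entry_agent -- the agent that first receives the blocking prompt
    max_turns   -- number of dialogue turns to simulate

    Assumption (hypothetical): a blocked agent forwards the prompt to all of
    its neighbours every turn, and every forwarded prompt succeeds.
    """
    blocked = {entry_agent}
    history = []
    for _ in range(max_turns):
        # Collect neighbours of blocked agents that are not yet blocked.
        newly_blocked = {
            neighbour
            for agent in blocked
            for neighbour in adjacency[agent]
        } - blocked
        blocked |= newly_blocked
        history.append(len(blocked) / len(adjacency))
    return history


# Two illustrative four-agent topologies (hypothetical examples).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

chain_history = simulate_spread(chain, entry_agent=0, max_turns=3)
star_history = simulate_spread(star, entry_agent=1, max_turns=3)
print(chain_history)
print(star_history)
```

Under these simplifying assumptions, the star topology is fully blocked one turn earlier than the chain even though the attack enters through a leaf, which illustrates why the number of turns needed to compromise a system depends on its topology.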

Paper Structure

This paper contains 18 sections, 7 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: P-ASR (%) on Open-ended LLM-MASs with various LLMs. An Open-ended LLM-MAS with six agents in free dialogue was evaluated at specific turns. Results show that Corba outperforms baselines, compromising most agents within a few turns.
  • Figure 2: Direct LLM evaluation.
  • Figure 3: LLM Checker for several Attacks.
  • Figure 4: LLM-MAS monitor evaluation.
  • Figure 5: Agent Monitor for several Attacks.
  • ...and 1 more figure