Firewalls to Secure Dynamic LLM Agentic Networks
Sahar Abdelnabi, Amr Gomaa, Eugene Bagdasarian, Per Ola Kristensson, Reza Shokri
TL;DR
The paper addresses securing dynamic, multi-turn LLM-based agentic networks where agents interact with users and other agents to achieve long-horizon goals. It introduces a three-layer firewall framework—input, data, and trajectory—whose policies are derived from prior conversations and enforced via a task-specific, quarantined language to prevent prompt injections while maintaining adaptability. Experimental travel-planning scenarios show substantial privacy and security gains (privacy leakage ≈2% vs ~70% in naive settings; calendar-deletion attacks ≈0% vs ~45%), with little to no loss in utility. The work offers a practical blueprint for building secure, autonomous agentic networks and makes its code publicly available for further research and deployment refinement.
Abstract
LLM agents will likely communicate on behalf of users with other entity-representing agents on tasks involving long-horizon plans with interdependent goals. Current work neglects these agentic networks and their challenges. We identify required properties for agent communication: proactivity, adaptability, privacy (sharing only task-necessary information), and security (preserving integrity and utility against selfish entities). After demonstrating communication vulnerabilities, we propose a practical design and protocol inspired by network security principles. Our framework automatically derives task-specific rules from prior conversations to build firewalls. These firewalls construct a closed language that is completely controlled by the developer. They transform any personal data to the allowed degree of permissibility entailed by the task. Both operations are completely quarantined from external attackers, disabling the potential for prompt injections, jailbreaks, or manipulation. By incorporating rules learned from their previous mistakes, agents rewrite their instructions and self-correct during communication. Evaluations on diverse attacks demonstrate our framework significantly reduces privacy and security vulnerabilities while allowing adaptability.
