Progent: Programmable Privilege Control for LLM Agents
Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, Dawn Song
TL;DR
Progent introduces a programmable runtime privilege-control framework that enforces fine-grained, tool-level access controls for LLM agents. It provides a domain-specific language for policy specification, enforces policies deterministically, and supports dynamic policy updates and flexible fallback actions to preserve agent utility. The approach is non-intrusive: it wraps existing tools with JSON Schema-based policies, and policies can also be generated by an LLM on a per-query basis. Evaluations on AgentDojo, ASB, and EHRAgent/AgentPoison report 0% attack success rates across the tested attacks while maintaining high task utility, indicating practical security gains for real-world LLM agents.
Abstract
LLM agents utilize Large Language Models as central components with diverse tools to complete various user tasks, but face significant security risks when interacting with external environments. Attackers can exploit these agents through various vectors, including indirect prompt injection, memory/knowledge base poisoning, and malicious tools, tricking agents into performing dangerous actions such as unauthorized financial transactions or data leakage. The core problem that enables attacks to succeed lies in over-privileged tool access. We introduce Progent, the first privilege control framework to secure LLM agents. Progent enforces security at the tool level by restricting agents to performing tool calls necessary for user tasks while blocking potentially malicious ones. Progent features a domain-specific language that allows for expressing fine-grained policies for controlling tool privileges, flexible fallback actions when calls are blocked, and dynamic policy updates to adapt to changing agent states. The framework operates deterministically at runtime, providing provable security guarantees. Thanks to our modular design, integrating Progent does not alter agent internals and only requires minimal changes to the existing agent implementation, enhancing its practicality and potential for widespread adoption. Our extensive evaluation across various agent use cases, using benchmarks like AgentDojo, ASB, and AgentPoison, demonstrates that Progent reduces attack success rates to 0%, while preserving agent utility and speed. Additionally, we show that LLMs can automatically generate effective policies, highlighting their potential for automating the process of writing Progent's security policies.
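To make the idea of tool-level privilege control concrete, the following is a minimal Python sketch of a deterministic policy check wrapped around a tool call. It is not Progent's DSL or API: the names `Policy`, `guarded_call`, and `send_money`, the first-match and default-deny semantics, and the small subset of JSON Schema keywords (`enum`, `pattern`, `maximum`) are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

DEFAULT_FALLBACK = "The tool call was blocked by policy."


@dataclass
class Policy:
    """Hypothetical policy record; not Progent's actual DSL."""
    tool: str                                # tool the rule applies to
    effect: str                              # "allow" or "forbid"
    constraints: Dict[str, Dict[str, Any]]   # per-argument, JSON Schema-like
    fallback: str = DEFAULT_FALLBACK         # message returned when blocked


def _arg_ok(value: Any, schema: Dict[str, Any]) -> bool:
    """Check one argument against a small subset of JSON Schema keywords."""
    if "enum" in schema and value not in schema["enum"]:
        return False
    if "pattern" in schema and not (
        isinstance(value, str) and re.search(schema["pattern"], value)
    ):
        return False
    if "maximum" in schema and not (
        isinstance(value, (int, float)) and value <= schema["maximum"]
    ):
        return False
    return True


def _matches(policy: Policy, tool: str, args: Dict[str, Any]) -> bool:
    """A policy matches when the tool name and every constraint agree."""
    if policy.tool != tool:
        return False
    return all(
        name in args and _arg_ok(args[name], schema)
        for name, schema in policy.constraints.items()
    )


def guarded_call(
    policies: List[Policy],
    tools: Dict[str, Callable[..., Any]],
    tool: str,
    args: Dict[str, Any],
) -> Any:
    """Assumed semantics: first matching policy decides, otherwise deny."""
    for policy in policies:
        if _matches(policy, tool, args):
            if policy.effect == "allow":
                return tools[tool](**args)
            return policy.fallback
    return DEFAULT_FALLBACK


# Hypothetical tool and policy: payments only to one known account, up to 100.
tools = {"send_money": lambda recipient, amount: f"sent {amount} to {recipient}"}
policies = [
    Policy(
        tool="send_money",
        effect="allow",
        constraints={
            "recipient": {"enum": ["UK12345678901234567890"]},
            "amount": {"maximum": 100},
        },
    )
]

ok = guarded_call(policies, tools, "send_money",
                  {"recipient": "UK12345678901234567890", "amount": 50})
blocked = guarded_call(policies, tools, "send_money",
                       {"recipient": "ATTACKER-ACCOUNT", "amount": 50})
```

Because the check is ordinary code rather than a model judgment, the same call and the same policy list always produce the same decision. In this sketch a dynamic policy update would amount to editing `policies` between calls, and the blocked call returns a fallback message to the agent instead of raising an error.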
