Progent: Programmable Privilege Control for LLM Agents

Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, Dawn Song

TL;DR

Progent introduces a runtime, programmable privilege framework that enforces fine-grained, tool-level access controls for LLM agents. It provides a domain-specific language for policy specification, supports deterministic enforcement, and enables dynamic policy updates and flexible fallbacks to preserve agent utility. The approach is non-intrusive, wrapping existing tools with JSON Schema-based policies, and can be augmented by LLM-generated policies for per-query security. Extensive evaluations across AgentDojo, ASB, and EHRAgent/AgentPoison demonstrate 0% attack success rates under diverse threats while maintaining high task effectiveness, highlighting practical security improvements for real-world LLM agents.

Abstract

LLM agents utilize Large Language Models as central components with diverse tools to complete various user tasks, but face significant security risks when interacting with external environments. Attackers can exploit these agents through various vectors, including indirect prompt injection, memory/knowledge base poisoning, and malicious tools, tricking agents into performing dangerous actions such as unauthorized financial transactions or data leakage. The core problem that enables attacks to succeed lies in over-privileged tool access. We introduce Progent, the first privilege control framework to secure LLM agents. Progent enforces security at the tool level by restricting agents to performing tool calls necessary for user tasks while blocking potentially malicious ones. Progent features a domain-specific language that allows for expressing fine-grained policies for controlling tool privileges, flexible fallback actions when calls are blocked, and dynamic policy updates to adapt to changing agent states. The framework operates deterministically at runtime, providing provable security guarantees. Thanks to our modular design, integrating Progent does not alter agent internals and only requires minimal changes to the existing agent implementation, enhancing its practicality and potential for widespread adoption. Our extensive evaluation across various agent use cases, using benchmarks like AgentDojo, ASB, and AgentPoison, demonstrates that Progent reduces attack success rates to 0%, while preserving agent utility and speed. Additionally, we show that LLMs can automatically generate effective policies, highlighting their potential for automating the process of writing Progent's security policies.
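
The tool-level enforcement described in the abstract can be illustrated with a small sketch. This is not Progent's actual DSL or API: the `Rule` class, the `check_call` and `guard` helpers, the first-match-wins ordering, the default-deny behavior, and the `send_money` tool are all hypothetical names and assumptions chosen for illustration. The sketch only shows the general idea of wrapping a tool so that each call is checked deterministically against rules over its arguments, with a fallback message returned when a call is blocked.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Rule:
    """Hypothetical policy rule; not the paper's DSL."""
    tool: str                                     # tool name the rule applies to
    allow: bool                                   # True = allow, False = forbid
    condition: Callable[[Dict[str, Any]], bool]   # predicate over call arguments
    fallback: str = "Tool call blocked by policy."  # returned when a forbid rule fires


def check_call(rules: List[Rule], tool: str, args: Dict[str, Any]) -> Tuple[bool, str]:
    """Assumed semantics: first matching rule wins; no match means deny."""
    for rule in rules:
        if rule.tool == tool and rule.condition(args):
            return rule.allow, ("" if rule.allow else rule.fallback)
    return False, f"No rule allows {tool}."


def guard(rules: List[Rule], tool: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so every call is checked before it executes."""
    def wrapper(**args: Any) -> Any:
        allowed, message = check_call(rules, tool, args)
        return fn(**args) if allowed else message
    return wrapper


# Hypothetical tool standing in for a real agent tool.
def send_money(recipient: str, amount: float) -> str:
    return f"sent {amount} to {recipient}"


rules = [
    Rule("send_money", False, lambda a: a.get("amount", 0) > 100, "Amount exceeds limit."),
    Rule("send_money", True, lambda a: a.get("recipient") == "alice"),
]
safe_send_money = guard(rules, "send_money", send_money)
```

Because the check is ordinary code rather than a model judgment, the same call with the same rules always yields the same decision, which is the property the abstract refers to as deterministic enforcement. The agent itself is unchanged; only the tool it is handed is wrapped.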

Paper Structure

This paper contains 65 sections, 18 figures, 5 tables, 4 algorithms.

Figures (18)

  • Figure 1: Left: a realistic attack [invariant_github_mcp_vul] exploiting coding agents to exfiltrate sensitive data about private GitHub repositories. Right top: Progent's overall design as a proxy to enforce privilege control over agents' tool calls. Right bottom: Progent's precise and fine-grained security policies to prevent data leakage while maintaining agent utility.
  • Figure 2: An example of a workspace agent that performs competitive analysis. Progent prevents unauthorized email sending by dynamically updating the policy set after the agent reads sensitive information.
  • Figure 3: A formal definition of tools in LLM agents.
  • Figure 4: Progent's domain-specific language for defining privilege control policies over agent tool calls.
  • Figure 5: Comparison between a vanilla agent (no defense), prior defenses, and Progent on AgentDojo [debenedetti2024agentdojo].
  • ...and 13 more figures
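
The dynamic policy update mentioned in the Figure 2 caption (revoking email sending once the agent has read sensitive information) can be sketched roughly as follows. The `PolicySet` class, its `revoke_after` mapping, and the `read_report` and `send_email` tools are hypothetical and assumed for illustration only; the paper's own update mechanism may be expressed quite differently.

```python
from typing import Any, Callable, Dict, Set


class PolicySet:
    """Hypothetical policy state: a set of allowed tools plus update rules."""

    def __init__(self, allowed: Set[str]) -> None:
        self.allowed: Set[str] = set(allowed)
        # Assumed update rule format: tool name -> tools revoked after it runs.
        self.revoke_after: Dict[str, Set[str]] = {}

    def call(self, name: str, fn: Callable[..., Any], **args: Any) -> Any:
        """Run the tool if allowed, then apply any update tied to it."""
        if name not in self.allowed:
            return f"blocked: {name}"
        result = fn(**args)
        self.allowed -= self.revoke_after.get(name, set())
        return result


# Hypothetical tools standing in for a workspace agent's tools.
def read_report(path: str) -> str:
    return f"contents of {path}"


def send_email(to: str, body: str) -> str:
    return f"email sent to {to}"


policies = PolicySet({"read_report", "send_email"})
policies.revoke_after["read_report"] = {"send_email"}

before = policies.call("send_email", send_email, to="bob", body="hi")
read = policies.call("read_report", read_report, path="q3.txt")
after = policies.call("send_email", send_email, to="bob", body="leak")
```

In this sketch, `before` succeeds, while `after` is blocked because reading the report removed `send_email` from the allowed set. The update is triggered by the tool call itself rather than by the model, so injected instructions in the report cannot re-enable the revoked tool.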