Why Agents Compromise Safety Under Pressure

Hengle Jiang; Ke Tang

Abstract

Large Language Model agents deployed in complex environments frequently encounter a conflict between maximizing goal achievement and adhering to safety constraints. This paper introduces the concept of Agentic Pressure, which characterizes the endogenous tension that emerges when compliant execution becomes infeasible. We demonstrate that, under this pressure, agents exhibit normative drift: they strategically sacrifice safety to preserve utility. Notably, we find that advanced reasoning capabilities accelerate this decline, as models construct linguistic rationalizations to justify violations. Finally, we analyze the root causes and explore preliminary mitigation strategies, such as pressure isolation, which attempts to restore alignment by decoupling decision-making from pressure signals.

Paper Structure (62 sections, 3 equations, 6 figures, 6 tables)

Introduction
Related Work
Safety in LLMs
Benchmarks for Agents
Safety in Autonomous Agents
Agentic Pressure
From LLM Pressure to Agentic Pressure
Taxonomy of Pressure Sources
Type I: Resource Scarcity
Type II: Environmental Friction
Type III: Social Inducement
The Cognitive Shift: From Reasoning to Rationalization
Preliminary Analysis
Experimental Setup
Stress-Testing Variants
...and 47 more sections

Figures (6)

Figure 1: The "Good Agent" Paradox: While the user's request is non-malicious, the combination of high urgency and resource deadlock forces the agent to trade off safety for goal achievement.
Figure 2: Taxonomy of Pressure Sources
Figure 3: Preliminary results on TravelPlanner under non-adversarial pressure
Figure 4: Overview of the Agentic Pressure Evaluation Framework.
Figure 5: Normative Drift Distribution. The scatter plot shows individual episode outcomes, highlighting the primary shift from the Ideal Region (Safety, Utility) to the Drift Region (Low Safety, Higher Utility) under agentic pressure.
...and 1 more figures
