From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

Xiaolei Zhang; Lu Zhou; Xiaogang Xu; Jiafei Wu; Tianyu Du; Heqing Huang; Hao Peng; Zhe Liu

TL;DR

This work presents a taxonomy of threats spanning cognitive manipulation, physical environment disruption, and multi-agent systemic failures, and aims to guide the development of multilayered, autonomy-aware defense architectures for trustworthy AI agent systems.

Abstract

Artificial Intelligence (AI) agents have evolved from passive predictive tools into active entities capable of autonomous decision-making and environmental interaction, driven by the reasoning capabilities of Large Language Models (LLMs). However, this evolution has introduced critical security vulnerabilities that existing frameworks fail to address. The Hierarchical Autonomy Evolution (HAE) framework organizes agent security into three tiers: Cognitive Autonomy (L1) concerns the integrity of internal reasoning; Executional Autonomy (L2) covers tool-mediated environmental interaction; Collective Autonomy (L3) addresses systemic risks in multi-agent ecosystems. We present a taxonomy of threats spanning cognitive manipulation, physical environment disruption, and multi-agent systemic failures, and evaluate existing defenses while identifying key research gaps. The findings aim to guide the development of multilayered, autonomy-aware defense architectures for trustworthy AI agent systems.

Paper Structure (38 sections, 4 figures, 1 table)

Introduction
Evolution of AI Agents
Anatomy of AI Agent
The HAE Framework Definition
L1 - Thinker (Cognitive Autonomy)
L2 - Doer (Executional Autonomy)
L3 - Society (Collective Autonomy)
Root Cause Analysis for Agent Security
Cognitive Autonomy — The Thinker
Autonomy Evolution in Cognitive Reasoning
Security Threats in Cognitive Autonomy
Indirect Prompt Injection
Cognitive Hijacking
Memory Corruption
Defense Mechanisms for Cognitive Integrity
...and 23 more sections

Figures (4)

Figure 2: Agent architecture showing perception, brain, memory, and action modules with security risks.
Figure 3: L1 Cognitive Autonomy Architecture and Threat Landscape. The figure depicts the internal cognitive loop of an intelligent agent as a thinker, encompassing perception, reasoning, and memory retrieval. Its security boundary is defined primarily by attacks on cognitive integrity, such as command hijacking, cognitive hijacking, and memory poisoning.
Figure 4: L2 Executional Autonomy Architecture and Threat Landscape. This figure demonstrates how agents function as executors that engage in substantive interactions with external digital and physical environments through tool interfaces, thereby introducing emerging threats with real-world kinetic consequences including confused deputy, tool abuse, environmental damage, and unsafe action chains.
Figure 5: L3 Collective Autonomy Architecture and Threat Landscape. The figure shows a Manager-Worker hierarchical structure in which L3 agents achieve decentralized collaboration via A2A communication protocols and capability evolution. These coordination mechanisms open channels for three categories of systemic risk: malicious collusion, viral infection, and systemic collapse.
