AI Agents Should be Regulated Based on the Extent of Their Autonomous Operations

Takayuki Osogami

TL;DR

The paper addresses existential risks from AI agents, which emerge from autonomous, long-horizon reasoning and planning and are not adequately captured by training-compute metrics alone. It argues for regulating AI agents by the extent of their autonomous operations, formalizing this through action sequences and action graphs to bound risk in open, poorly observed environments. The author introduces a baseline framework featuring strongly acceptable action-sequence sets, similarity-based acceptability, and concrete implementation and enforcement pathways, while acknowledging that this is not a complete solution and that further research is required. The approach aims to slow risky capability growth, provide a robust, observable safety lever, and complement existing compute-based regulatory discussions; the policy context spans EU regulations and US executive initiatives.

Abstract

This position paper argues that AI agents should be regulated by the extent to which they operate autonomously. AI agents with long-term planning and strategic capabilities can pose significant risks of human extinction and irreversible global catastrophes. While existing regulations often focus on computational scale as a proxy for potential harm, we argue that such measures are insufficient for assessing the risks posed by agents whose capabilities arise primarily from inference-time computation. To support our position, we discuss relevant regulations and recommendations from scientists regarding existential risks, as well as the advantages of using action sequences -- which reflect the degree of an agent's autonomy -- as a more suitable measure of potential impact than existing metrics that rely on observing environmental states.

Paper Structure

This paper contains 18 sections and 2 figures.

Figures (2)

  • Figure 1: AI agent trained with a configuration generates actions
  • Figure 2: Safety of actions in a nearly unobservable Markov decision process