What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty

Aran Nayebi

Abstract

As artificial agents become increasingly capable, what internal structure is *necessary* for an agent to act competently under uncertainty? Classical results show that optimal control can be *implemented* using belief states or world models, but not that such representations are required. We prove quantitative "selection theorems" showing that low *average-case regret* on structured families of action-conditioned prediction tasks forces an agent to implement a predictive, structured internal state. Our results cover stochastic policies, partial observability, and evaluation under task distributions, without assuming optimality, determinism, or access to an explicit model. Technically, we reduce predictive modeling to binary "betting" decisions and show that regret bounds limit probability mass on suboptimal bets, enforcing the predictive distinctions needed to separate high-margin outcomes. In fully observed settings, this yields approximate recovery of the interventional transition kernel; under partial observability, it implies necessity of belief-like memory and predictive state, addressing an open question in prior world-model recovery work.

Paper Structure (20 sections, 10 theorems, 109 equations)

This paper contains 20 sections, 10 theorems, 109 equations.

Introduction
Related Work
Notation and Constants
World model recovery in fully observed environments
Selection Theorems under Partial Observability
Setup and Notation (Partial Observability)
World Models and Memory (Predictive-State View)
Predictive modeling necessity under partial observability
Memory necessity
Structured task families: modularity, tradeoffs, and representational match
Convention (vanishing regret).
Discussion
Proof of Lemma \ref{lem:bet}
Proof of Theorem \ref{thm:fo_avg_stoch}
Proof of Theorem \ref{thm:predictive}
...and 5 more sections

Key Result

Lemma 1

Define the wrong-action mass [equation missing in source]. Then the normalized regret $\delta$ is equivalent to: [equation missing in source]. In the special betting case where $u_L$ and $u_R$ are complementary, namely $u_R := 1 - u_L$, defining the margin $m := |u_L - \tfrac{1}{2}|$, we obtain [equation missing in source]. Consequently, on the event $m \ge \gamma \in (0, \tfrac{1}{2}]$, [bound missing in source].
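The displayed equations of Lemma 1 did not survive in this summary, so the following is only a minimal sketch of the mechanism the lemma names, not the paper's statement. It assumes an *unnormalized* regret (optimal payoff minus expected payoff) on a single complementary bet, whereas the lemma's $\delta$ is a normalized quantity whose exact scaling is not shown here. The function names `margin`, `wrong_action_mass`, and `raw_regret` are hypothetical. Under that assumption, regret equals $2m$ times the probability placed on the lower-payoff action, so a margin of at least $\gamma$ caps that probability at regret divided by $2\gamma$.

```python
import math


def margin(u_left: float) -> float:
    """Margin m = |u_L - 1/2| of a complementary bet (u_R = 1 - u_L)."""
    return abs(u_left - 0.5)


def wrong_action_mass(u_left: float, p_left: float) -> float:
    """Probability a stochastic policy places on the lower-payoff action.

    p_left is the probability of choosing the left action.
    """
    u_right = 1.0 - u_left
    if u_left == u_right:
        return 0.0  # tie: neither action is suboptimal
    return (1.0 - p_left) if u_left > u_right else p_left


def raw_regret(u_left: float, p_left: float) -> float:
    """Unnormalized regret: best payoff minus the policy's expected payoff.

    Assumption: this is NOT the paper's normalized regret delta.
    """
    u_right = 1.0 - u_left
    expected = p_left * u_left + (1.0 - p_left) * u_right
    return max(u_left, u_right) - expected


if __name__ == "__main__":
    gamma = 0.25  # illustrative margin threshold
    for u_left, p_left in [(0.8, 0.9), (0.2, 0.9), (0.95, 0.5)]:
        m = margin(u_left)
        w = wrong_action_mass(u_left, p_left)
        r = raw_regret(u_left, p_left)
        # Identity for a complementary bet: regret = 2 * m * wrong mass.
        assert math.isclose(r, 2 * m * w, abs_tol=1e-12)
        if m >= gamma:
            # High-margin bets: small regret forces small wrong-action mass.
            assert w <= r / (2 * gamma) + 1e-12
        print(f"u_L={u_left} p_L={p_left} margin={m:.3f} wrong_mass={w:.3f} regret={r:.3f}")
```

The identity holds because choosing the wrong side of a complementary bet costs exactly $|u_L - u_R| = 2m$, which is why low regret on high-margin bets limits how often a policy can bet against the more likely outcome.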

Theorems & Definitions (22)

Definition 1: Composite goal family $G^{(n)}_{s,a,s',k}$
Lemma 1: Binary-decision regret controls wrong-action mass
Theorem 1: Fully observed: stochastic policies + average regret $\Rightarrow$ approximate transition model
Remark 1: Independence from goal family size
Corollary 1: Causal content: approximately recovered interventional kernel
Proof of Corollary 1
Corollary 2: No generic Level 3 recovery from the interventional kernel
Proof of Corollary 2
Theorem 2: Predictive modeling necessity
Theorem 3: Memory necessity
...and 12 more
