General agents contain world models

Jonathan Richens, David Abel, Alexis Bellot, Tom Everitt

TL;DR

Any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment, and increasing the agent's performance or the complexity of the goals it can achieve requires learning increasingly accurate world models.

Abstract

Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agent's performance or the complexity of the goals it can achieve requires learning increasingly accurate world models. This has a number of consequences, from developing safe and general agents, to bounding agent capabilities in complex environments, to providing new algorithms for eliciting world models from agents.

Paper Structure

This paper contains 22 sections, 10 theorems, 49 equations, 4 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Let $P_{ss'}(a) = P(S_{t+1} = s' \mid A_t = a, S_t = s)$ be the transition probabilities of an environment satisfying the environment assumption. Let $\pi$ be a bounded goal-conditioned agent (Definition 5) with a maximum failure rate $\delta$ for all goals $\psi \in \bm{\Psi}_n$, where $\bm{\Psi}$ [...] for any $n, \delta$, and for $\delta \ll 1$, $n \gg 1$ the error scales as [...]. The proof is given in the appendix.

Figures (4)

  • Figure 1: Our result complements previous insights from planning and inverse RL. Planning uses a world model and a goal to determine a policy; IRL and inverse planning use an agent's policy and a world model to identify its goal; our result uses an agent's policy and its goal to identify a world model.
  • Figure 2: The agent-environment system. Agents are maps from states $s_t$ (or histories) and goals $\psi$ to actions $a_t$. The dashed line represents the algorithm estimate_p, which recovers the environment transition probabilities from this agent map.
  • Figure 3: a) The mean error $\langle \epsilon \rangle$ in the world model recovered by the algorithm estimate_p_simple decreases as the agent learns to generalize to higher-depth goals. $N_\text{max}(\langle \delta \rangle = 0.04)$ is the maximum goal depth such that the agent achieves a mean regret $\leq 0.04$. The scaling is $\mathcal{O}(n^{-1/2})$, matching the scaling between the worst-case error $\epsilon$ and worst-case regret $\delta$ in Theorem 1. b) The mean error as a function of $\langle \delta (n = 50)\rangle$, the mean regret the agent achieves for depth $n = 50$ goals. In both panels, error bars show 95% confidence intervals for the mean over 10 experiments in which the agents were re-trained with different experience trajectories of the same length.
  • Figure 4: Illustration of the composite goal used in the proof of Theorem 1.
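
The idea behind Figures 1 and 2, reading transition probabilities off an agent's goal-conditioned choices, can be illustrated with a toy sketch. It is not the paper's estimate_p algorithm. It assumes a hypothetical perfectly optimal agent (`make_optimal_agent`) that is offered an either-or goal over $n$ repeated trials of one transition: "the outcome occurs at most $k$ times" versus "the outcome occurs more than $k$ times". The estimator (`estimate_transition_prob`, also a hypothetical name) sees only which sub-goal the agent pursues, never the true probability.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def make_optimal_agent(p_true: float, n: int):
    """Hypothetical optimal agent (an assumption of this sketch).

    Given a threshold k, it pursues whichever sub-goal is more likely
    to succeed under the true transition probability p_true.
    """

    def choose(k: int) -> str:
        p_low = binom_cdf(k, n, p_true)  # success chance of "at most k times"
        return "low" if p_low >= 0.5 else "high"

    return choose


def estimate_transition_prob(choose, n: int) -> float:
    """Estimate p from the agent's choices alone.

    The smallest k at which the agent switches to the "low" sub-goal is
    the median of Binomial(n, p), which lies within one count of n * p.
    """
    for k in range(n + 1):
        if choose(k) == "low":
            return k / n
    return 1.0  # not reached: at k = n the "low" sub-goal always succeeds


agent = make_optimal_agent(p_true=0.3, n=100)
p_hat = estimate_transition_prob(agent, n=100)
```

In this toy setting the switching point pins down the transition probability to within $1/n$, so deeper goals (larger $n$) yield a sharper estimate. An agent with a nonzero failure rate would switch at a less precise threshold, which is the qualitative trade-off that Theorem 1 quantifies.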

Theorems & Definitions (29)

  • Definition 1: Controlled Markov process
  • Definition 2: Goals
  • Definition 3: Composite goals
  • Definition 4: optimal goal-conditioned agent
  • Definition 5: bounded goal-conditioned agent
  • Theorem 1
  • Theorem 2
  • Definition 5: Controlled Markov process
  • Definition 5: Goals
  • Definition 6: Sequential goals
  • ...and 19 more