Robust agents learn causal world models

Jonathan Richens; Tom Everitt

TL;DR

This paper asks whether learning a causal model is necessary for robust generalization under distributional shift, and shows that it is: any agent that satisfies a regret bound $\delta$ across a sufficiently large class of shifts must have learned an approximate causal model of the data-generating process, and for optimal policies the learned model is exact. The authors formalize causal models as causal Bayesian networks with local interventions, show that domain adaptation requirements imply causal discovery requirements, and demonstrate causal discovery by observing regret-bounded agents in synthetic environments. They discuss implications for transfer learning, causal inference, and emergent world models, arguing that causal representations emerge naturally from multi-domain objectives and are crucial for robustness.

Abstract

It has long been hypothesised that causal reasoning plays a fundamental role in robust and general intelligence. However, it is not known if agents must learn causal models in order to generalise to new domains, or if other inductive biases are sufficient. We answer this question, showing that any agent capable of satisfying a regret bound under a large set of distributional shifts must have learned an approximate causal model of the data generating process, which converges to the true causal model for optimal agents. We discuss the implications of this result for several research areas including transfer learning and causal inference.
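To make the notion of a regret bound under distributional shift concrete, here is a minimal toy sketch. It is not the paper's construction: the single binary variable `X`, the utility "1 if the decision matches `X`", and the representation of a shift as a new value of P(X=1) are all assumptions chosen for illustration, as are the function names. The policy is given the shift as input, mirroring the setting in which the agent adapts to a known shift.

```python
from typing import Callable, Iterable


def expected_utility(decision: int, p_x1: float) -> float:
    """Expected utility of `decision` when P(X=1) = p_x1 and U = 1 if D == X (toy assumption)."""
    return p_x1 if decision == 1 else 1.0 - p_x1


def regret(decision: int, p_x1: float) -> float:
    """Gap between the best achievable expected utility and the one obtained."""
    best = max(expected_utility(0, p_x1), expected_utility(1, p_x1))
    return best - expected_utility(decision, p_x1)


def satisfies_regret_bound(
    policy: Callable[[float], int], shifts: Iterable[float], delta: float
) -> bool:
    """True if the policy's regret is at most `delta` under every shift in `shifts`."""
    return all(regret(policy(p), p) <= delta for p in shifts)


# Hypothetical shifts: each value replaces P(X=1) in the toy environment.
shifts = [0.1, 0.3, 0.7, 0.9]


def fixed_policy(p_x1: float) -> int:
    """Ignores the shift and always guesses X = 1."""
    return 1


def adaptive_policy(p_x1: float) -> int:
    """Uses the shifted distribution to pick the more likely value of X."""
    return 1 if p_x1 >= 0.5 else 0


print(satisfies_regret_bound(fixed_policy, shifts, delta=0.1))
print(satisfies_regret_bound(adaptive_policy, shifts, delta=0.0))
```

In this toy setting, only the policy whose decisions track how the shift changes the distribution of `X` meets the bound on every shift, so its behaviour reveals information about the environment. The paper's result is a far more general statement of this kind for causal Bayesian networks.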

Paper Structure (5 sections, 2 equations, 1 figure)

Introduction
Outline of paper.
Preliminaries
Causal models
Decision tasks

Theorems & Definitions (4)

Definition 1: Bayesian networks
Definition 2: Local interventions
Definition 3: Mixtures of interventions
Definition 4: Causal influence diagram
