Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments

Florent Delgrange

TL;DR

This paper outlines a vision for foundation world models: persistent, compositional representations that unify reinforcement learning, reactive/program synthesis, and abstraction mechanisms, enabling agents to synthesize verifiable programs, derive new policies from a small number of interactions, and maintain correctness while adapting to novelty.

Abstract

The next generation of autonomous agents must not only learn efficiently but also act reliably and adapt their behavior in open worlds. Standard approaches typically assume fixed tasks and environments with little or no novelty, which limits world models' ability to support agents that must evolve their policies as conditions change. This paper outlines a vision for foundation world models: persistent, compositional representations that unify reinforcement learning, reactive/program synthesis, and abstraction mechanisms. We propose an agenda built around four components: (i) learnable reward models from specifications to support optimization with clear objectives; (ii) adaptive formal verification integrated throughout learning; (iii) online abstraction calibration to quantify the reliability of the model's predictions; and (iv) test-time synthesis and world-model generation guided by verifiers. Together, these components enable agents to synthesize verifiable programs, derive new policies from a small number of interactions, and maintain correctness while adapting to novelty. The resulting framework positions foundation world models as a substrate for learning, reasoning, and adaptation, laying the groundwork for agents that not only act well but can explain and justify the behavior they adopt.

Paper Structure (9 sections, 1 figure)

Introduction
The vision: learning verifiable world models.
Learning Verifiable World Models
Learning from Formalized Rewards
Verification During Learning
Abstraction and World-Model Calibration
Foundation World Models
LLMs as Specification Refiners
Conclusion

Figures (1)

Figure 1: Given a specification $\varphi$ formalizing the agent's intended behavior, a reward model is automatically generated from $\varphi$. While optimizing the return, the agent simultaneously learns a representation of its observations in the form of the state space of a verifiable world model, and the policy is learned directly on this representation. To provide guarantees, the world model is passed to a verifier (e.g., a model checker), whose output guides the agent or corrects the policy as needed. At any time, the verifier can return a certificate covering both the satisfaction of the specification and the abstraction quality of the world model. Note that we do not assume access to the explicit environment dynamics; the ability to simulate them is sufficient.
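As a rough illustration of the first step in Figure 1, the sketch below derives a reward signal from a specification by tracking progress in a small automaton. It is not the paper's construction: the specification (reach `goal` while avoiding `unsafe`, i.e. `(not unsafe) U goal`), the label names, the `SpecAutomaton` class, and the sparse 0/1 reward are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecAutomaton:
    """Hypothetical 3-state automaton for the assumed spec (not unsafe) U goal."""
    initial: str = "q0"
    accepting: str = "q_acc"
    rejecting: str = "q_rej"

    def step(self, q: str, labels: frozenset) -> str:
        if q != self.initial:
            return q  # accepting and rejecting states are absorbing
        if "goal" in labels:
            return self.accepting  # spec satisfied as soon as goal holds
        if "unsafe" in labels:
            return self.rejecting  # unsafe seen before goal: spec violated
        return q


def reward(aut: SpecAutomaton, q: str, q_next: str) -> float:
    """Assumed sparse reward: 1 on first entering the accepting state, else 0."""
    return 1.0 if q != aut.accepting and q_next == aut.accepting else 0.0


def run(aut: SpecAutomaton, trace):
    """Replay a trace of label sets; return the final automaton state and return."""
    q, total = aut.initial, 0.0
    for labels in trace:
        q_next = aut.step(q, frozenset(labels))
        total += reward(aut, q, q_next)
        q = q_next
    return q, total


if __name__ == "__main__":
    aut = SpecAutomaton()
    print(run(aut, [[], ["goal"]]))          # goal reached safely
    print(run(aut, [["unsafe"], ["goal"]]))  # unsafe visited first
```

In this sketch the agent would be trained on the product of its learned representation and the automaton state, so that the reward depends only on the current pair of states. The verifier in Figure 1 would then check the same automaton against the learned world model; that step is not shown here.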
