ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning

Yichao Liang, Dat Nguyen, Cambridge Yang, Tianyang Li, Joshua B. Tenenbaum, Carl Edward Rasmussen, Adrian Weller, Zenna Tavares, Tom Silver, Kevin Ellis

TL;DR

ExoPredicator addresses long-horizon robot planning in environments where exogenous processes unfold concurrently with the agent's actions. It learns abstract world models that combine symbolic state predicates with causal processes, jointly learning (i) predicates that abstract observations into a compact state representation and (ii) the time course of both endogenous actions and exogenous mechanisms, using variational inference and LLM-guided proposals. A big-step planner operates over these abstractions, enabling efficient lookahead that accounts for delayed effects. Across five simulated tabletop domains, the approach generalizes to unseen tasks with more objects and more complex goals and outperforms a range of baselines. The results support combining symbolic predicates, learned temporal dynamics, and foundation-model guidance for planning from limited data.

Abstract

Long-horizon embodied planning is challenging because the world does not only change through an agent's actions: exogenous processes (e.g., water heating, dominoes cascading) unfold concurrently with the agent's actions. We propose a framework for abstract world models that jointly learns (i) symbolic state representations and (ii) causal processes for both endogenous actions and exogenous mechanisms. Each causal process models the time course of a stochastic cause-effect relation. We learn these world models from limited data via variational Bayesian inference combined with LLM proposals. Across five simulated tabletop robotics environments, the learned models enable fast planning that generalizes to held-out tasks with more objects and more complex goals, outperforming a range of baselines.

Paper Structure

This paper contains 63 sections, 12 equations, 6 figures.

Figures (6)

  • Figure 1: Dynamic environments include both endogenous processes (actions under the agent's direct control, such as Switch On Faucet) and exogenous processes (e.g., Jug Filling with Water) that evolve on their own. Planning requires reasoning about both kinds of processes.
  • Figure 2: Raw input maps to a state abstraction via predicates: short Python programs detecting binary features. Learning the state abstraction means synthesizing these programs. Temporal dynamics of abstract states are governed by causal processes: either endogenous processes (actions), or exogenous processes in the outside world. Causes realize their effects only after a delay, and can be interleaved. Learning causal processes allows planning by breaking frame-by-frame dynamics into discrete jumps between abstract states. (Illustration simplified; see text.)
  • Figure 3: The online learning loop, where the agent repeatedly uses its current model to plan and interact with the world, then refines that model by learning new predicates and causal processes from the experience. The figure shows an example where the agent's initial model in iteration $i$ leads to a failed plan ("Water Spilled!"). After observing this failure and updating its knowledge ("Diff Learned Model"), the agent creates a successful plan in iteration $i+1$ ("Diff plan after learning").
  • Figure 4: Environments. Top row: train task examples. Bottom row: evaluation task examples.
  • Figure 5: Successful ExoPredicator trajectories in the Domino (top) and Fan (bottom) environments. The code highlights the key learned exogenous processes, describing how dominoes cascade and how the fan's wind moves the ball. These processes incorporate predicates invented by the agent, like NOT-IsImmovable and FanFaces, which enable efficient and effective planning.
  • ...and 1 more figure
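
The ideas in the Figure 1 and Figure 2 captions can be illustrated with a minimal sketch: predicates as short Python programs that detect binary features, and a causal process whose effect is realized only after a delay, so that a planner can jump directly between abstract states. This is not the paper's implementation. The raw-state layout, the feature names (`switch`, `x`, `water_level`), the predicate names, the `CausalProcess` fields, and the `big_step` helper are all hypothetical, and the delay is a fixed integer here, whereas the paper describes stochastic timing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical raw state: object name -> feature dict (not the paper's format).
RawState = Dict[str, Dict[str, float]]
Atom = Tuple[str, str]  # (predicate name, object name)


def faucet_on(state: RawState, obj: str) -> bool:
    """Illustrative predicate: true when the faucet switch is up."""
    return state[obj]["switch"] > 0.5


def jug_under_faucet(state: RawState, obj: str) -> bool:
    """Illustrative predicate: true when the jug is aligned with the faucet."""
    return abs(state[obj]["x"] - state["faucet"]["x"]) < 0.05


def jug_filled(state: RawState, obj: str) -> bool:
    """Illustrative predicate: true when the jug is (nearly) full."""
    return state[obj]["water_level"] >= 0.9


# Each predicate is paired with the objects it is evaluated on (assumed setup).
PREDICATES: Dict[str, Tuple[Callable[[RawState, str], bool], List[str]]] = {
    "FaucetOn": (faucet_on, ["faucet"]),
    "JugUnderFaucet": (jug_under_faucet, ["jug"]),
    "JugFilled": (jug_filled, ["jug"]),
}


def abstract(state: RawState) -> FrozenSet[Atom]:
    """Map a raw state to the set of ground atoms that currently hold."""
    return frozenset(
        (name, obj)
        for name, (fn, objs) in PREDICATES.items()
        for obj in objs
        if fn(state, obj)
    )


@dataclass(frozen=True)
class CausalProcess:
    """A cause-effect relation whose effect appears after a delay."""
    name: str
    condition: FrozenSet[Atom]    # atoms that must hold for the process to run
    add_effects: FrozenSet[Atom]  # atoms that become true once it completes
    delay: int                    # fixed delay in time steps (simplification)


def big_step(
    atoms: FrozenSet[Atom], processes: List[CausalProcess]
) -> Tuple[FrozenSet[Atom], int]:
    """Jump to the next abstract state instead of simulating every frame.

    Simplifying assumptions: all active processes start now, and their
    conditions keep holding until the earliest one completes.
    """
    active = [
        p for p in processes
        if p.condition <= atoms and not p.add_effects <= atoms
    ]
    if not active:
        return atoms, 0  # nothing pending: the abstract state is stable
    soonest = min(active, key=lambda p: p.delay)
    return atoms | soonest.add_effects, soonest.delay


if __name__ == "__main__":
    raw = {
        "faucet": {"switch": 1.0, "x": 0.0},
        "jug": {"x": 0.02, "water_level": 0.0},
    }
    filling = CausalProcess(
        name="JugFilling",
        condition=frozenset({("FaucetOn", "faucet"), ("JugUnderFaucet", "jug")}),
        add_effects=frozenset({("JugFilled", "jug")}),
        delay=5,
    )
    next_atoms, elapsed = big_step(abstract(raw), [filling])
    print(sorted(next_atoms), elapsed)
```

In this sketch the exogenous `JugFilling` process is triggered by the abstract state itself, not by an agent action, which is the distinction Figure 1 draws. A planner built on `big_step` would search over such jumps and track elapsed time, rather than stepping the raw state frame by frame.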