Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems

Mingwei Li; Xiaoyuan Zhang; Chengwei Yang; Zilong Zheng; Yaodong Yang

TL;DR

PRISM-WM tackles the failure modes of monolithic latent dynamics in hybrid systems by decomposing transitions into composable base dynamics via a context-aware Mixture-of-Experts with latent orthogonalization. The gating mechanism implicitly identifies physical regimes while specialized experts model regime-specific transitions, reducing long-horizon rollout drift and mitigating mode interference. The approach functions as a drop-in enhancement for both online planning (TD-MPC) and direct policy learning (PWM), improving planning fidelity and gradient stability. Experimental results across DiffRL, MT30, and Humanoid benchmarks demonstrate superior sample efficiency, better generalization, and robust long-horizon performance, highlighting PRISM-WM as a strong foundational component for next-generation model-based agents.

Abstract

Model-based planning in robotic domains is fundamentally challenged by the hybrid nature of physical dynamics, where continuous motion is punctuated by discrete events such as contacts and impacts. Conventional latent world models typically employ monolithic neural networks that enforce global continuity, inevitably over-smoothing the distinct dynamic modes (e.g., sticking vs. sliding, flight vs. stance). For a planner, this smoothing results in catastrophic compounding errors during long-horizon lookaheads, rendering the search process unreliable at physical boundaries. To address this, we introduce the Prismatic World Model (PRISM-WM), a structured architecture designed to decompose complex hybrid dynamics into composable primitives. PRISM-WM leverages a context-aware Mixture-of-Experts (MoE) framework where a gating mechanism implicitly identifies the current physical mode, and specialized experts predict the associated transition dynamics. We further introduce a latent orthogonalization objective to ensure expert diversity, effectively preventing mode collapse. By accurately modeling the sharp mode transitions in system dynamics, PRISM-WM significantly reduces rollout drift. Extensive experiments on challenging continuous control benchmarks, including high-dimensional humanoids and diverse multi-task settings, demonstrate that PRISM-WM provides a superior high-fidelity substrate for trajectory optimization algorithms (e.g., TD-MPC), proving its potential as a powerful foundational model for next-generation model-based agents.
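The gated decomposition described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `MoEDynamics`, the linear gate and experts, the residual update, and the particular form of `orthogonality_penalty` (mean squared cosine similarity between expert outputs) are all assumptions chosen only to make the idea concrete.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class MoEDynamics:
    """Hypothetical mixture-of-experts latent dynamics (illustrative only)."""

    def __init__(self, latent_dim, action_dim, num_experts, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = latent_dim + action_dim
        # Assumption: one linear gate and one linear expert per mode.
        self.W_gate = rng.normal(0.0, 0.1, (in_dim, num_experts))
        self.W_exp = rng.normal(0.0, 0.1, (num_experts, in_dim, latent_dim))

    def forward(self, z, a):
        x = np.concatenate([z, a], axis=-1)              # (B, in_dim)
        gate = softmax(x @ self.W_gate)                  # (B, K) soft mode weights
        # Each expert proposes its own latent update.
        deltas = np.tanh(np.einsum("bi,kio->bko", x, self.W_exp))  # (B, K, D)
        # Assumption: residual update weighted by the gate.
        z_next = z + np.einsum("bk,bko->bo", gate, deltas)
        return z_next, gate, deltas


def orthogonality_penalty(deltas, eps=1e-8):
    """Mean squared cosine similarity between distinct experts' outputs."""
    unit = deltas / (np.linalg.norm(deltas, axis=-1, keepdims=True) + eps)
    gram = np.einsum("bkd,bjd->bkj", unit, unit)         # (B, K, K)
    k = gram.shape[-1]
    off_diag = gram * (1.0 - np.eye(k))                  # zero out the diagonal
    return float((off_diag ** 2).sum(axis=(-1, -2)).mean() / (k * (k - 1)))


def rollout(model, z0, actions):
    """Roll the latent state forward over a sequence of actions (H, B, A)."""
    z = z0
    for a in actions:
        z, _, _ = model.forward(z, a)
    return z
```

In this sketch the penalty is zero when the experts' proposed updates are mutually orthogonal and approaches one when they coincide, so adding it to a prediction loss would push experts toward distinct behaviors. How PRISM-WM actually defines its gating context and orthogonalization objective is specified in the paper itself, not here.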
