DMWM: Dual-Mind World Model with Long-Term Imagination
Lingyi Wang, Rashed Shelim, Walid Saad, Naren Ramakrishnan
TL;DR
This work addresses the challenge of reliable long-horizon imagination in world models by introducing DMWM, a Dual-Mind World Model that combines a fast RSSM-based System 1 (RSSM-S1) with a logic-integrated System 2 (LINN-S2). LINN-S2 enforces logical consistency over extended horizons via hierarchical deep reasoning and logic regularizers, while an inter-system feedback loop couples the two components and refines domain-specific logics through a logic-augmented ELBO. Empirical results on DMControl and robotic benchmarks show that DMWM yields superior logical consistency, data and trial efficiency, and robust long-term imagination compared to RSSM-based baselines and gradient-based MPC, including significant gains when data and horizon sizes are constrained. The approach promises improved robustness and interpretability for model-based RL and MPC and is a potential step toward more general, logic-aware world models, though it relies on predefined logical rules that could be learned end-to-end in future work.
Abstract
Imagination in world models is crucial for enabling agents to learn long-horizon policies in a sample-efficient manner. Existing recurrent state-space model (RSSM)-based world models depend on single-step statistical inference to capture the environment dynamics and, hence, are unable to perform long-term imagination tasks due to the accumulation of prediction errors. Inspired by the dual-process theory of human cognition, we propose a novel dual-mind world model (DMWM) framework that integrates logical reasoning to enable imagination with logical consistency. DMWM is composed of two components: an RSSM-based System 1 (RSSM-S1) component that handles state transitions in an intuitive manner and a logic-integrated neural network-based System 2 (LINN-S2) component that guides the imagination process through hierarchical deep logical reasoning. An inter-system feedback mechanism is designed to ensure that the imagination process follows the logical rules of the real environment. The proposed framework is evaluated on benchmark tasks from the DMControl suite that require long-term planning. Extensive experimental results demonstrate that the proposed framework yields significant improvements in logical coherence, trial efficiency, data efficiency, and long-term imagination over state-of-the-art world models.
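The accumulation of prediction errors mentioned above can be illustrated with a toy example. The sketch below is not the DMWM or RSSM implementation; it assumes a hypothetical scalar linear system and a learned one-step model whose coefficient is off by 1%, and all names (`rollout`, `rollout_errors`, `a_true`, `a_model`) are illustrative. It shows only that a small one-step error compounds when the model is rolled forward on its own predictions.

```python
def rollout(a, x0, steps):
    """Iterate x_{t+1} = a * x_t from x0 and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a * xs[-1])
    return xs


def rollout_errors(a_true=1.0, a_model=1.01, x0=1.0, steps=50):
    """Absolute gap between the true trajectory and the imagined one.

    a_true  : coefficient of the (hypothetical) real environment
    a_model : coefficient of the slightly mis-specified one-step model
    """
    true_traj = rollout(a_true, x0, steps)
    # The model is fed its own predictions, as in open-loop imagination.
    model_traj = rollout(a_model, x0, steps)
    return [abs(m - t) for m, t in zip(model_traj, true_traj)]


errors = rollout_errors()
print(f"error after  1 step : {errors[1]:.4f}")
print(f"error after 50 steps: {errors[50]:.4f}")
```

In this toy setting the error after 50 imagined steps is more than ten times the one-step error, although the one-step model is accurate to within 1%. This is the failure mode that motivates adding consistency constraints over long horizons.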
