DMWM: Dual-Mind World Model with Long-Term Imagination

Lingyi Wang, Rashed Shelim, Walid Saad, Naren Ramakrishnan

TL;DR

This work addresses the challenge of reliable long-horizon imagination in world models by introducing DMWM, a Dual-Mind World Model that combines a fast RSSM-based System 1 (RSSM-S1) with a logic-integrated System 2 (LINN-S2). LINN-S2 enforces logical consistency over extended horizons via hierarchical deep reasoning and logic regularizers, while an inter-system feedback loop couples the two components and refines domain-specific logics through a logic-augmented ELBO. Empirical results on DMControl and robotic benchmarks show that DMWM yields superior logical consistency, data and trial efficiency, and robust long-term imagination compared to RSSM-based baselines and gradient-based MPC, including significant gains under constrained data and horizon sizes. The approach promises improved robustness and interpretability for model-based RL and MPC, with potential as a step toward more general, logic-aware world models, though it relies on predefined logical rules that could be learned end-to-end in future work.

Abstract

Imagination in world models is crucial for enabling agents to learn long-horizon policies in a sample-efficient manner. Existing recurrent state-space model (RSSM)-based world models depend on single-step statistical inference to capture the environment dynamics and hence are unable to perform long-term imagination tasks, because prediction errors accumulate over the imagined horizon. Inspired by the dual-process theory of human cognition, we propose a novel dual-mind world model (DMWM) framework that integrates logical reasoning to enable imagination with logical consistency. DMWM is composed of two components: an RSSM-based System 1 (RSSM-S1) component that handles state transitions in an intuitive manner and a logic-integrated neural network-based System 2 (LINN-S2) component that guides the imagination process through hierarchical deep logical reasoning. An inter-system feedback mechanism is designed to ensure that the imagination process follows the logical rules of the real environment. The proposed framework is evaluated on benchmark tasks from the DMControl suite that require long-term planning. Extensive experimental results demonstrate that the proposed framework yields significant improvements in logical coherence, trial efficiency, data efficiency, and long-term imagination over state-of-the-art world models.
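The error-accumulation argument in the abstract can be illustrated with a toy example. The sketch below is not the paper's RSSM or any part of DMWM: it assumes a hypothetical scalar system x_{t+1} = a * x_t and a hypothetical learned model whose coefficient is off by 1% per step. All names and constants (`rollout`, `TRUE_A`, `MODEL_A`, `HORIZON`) are illustrative assumptions.

```python
def rollout(a, x0, horizon):
    """Iterate x_{t+1} = a * x_t for `horizon` steps and return all states."""
    xs = [x0]
    for _ in range(horizon):
        xs.append(a * xs[-1])
    return xs


TRUE_A = 1.00    # assumed true dynamics coefficient (hypothetical)
MODEL_A = 1.01   # assumed learned coefficient, 1% off per step (hypothetical)
HORIZON = 30     # imagination length chosen for illustration only

# Both rollouts start from the same state; the imagined one feeds its own
# predictions back in, so each one-step error is carried into the next step.
true_traj = rollout(TRUE_A, 1.0, HORIZON)
imagined_traj = rollout(MODEL_A, 1.0, HORIZON)

# Absolute gap between imagined and true state at every step.
errors = [abs(m - t) for m, t in zip(imagined_traj, true_traj)]

for step in (1, 10, 20, 30):
    print(f"step {step:2d}: abs error = {errors[step]:.4f}")
```

In this toy setting the gap after 30 steps is larger than 30 times the one-step gap, because each step's error is multiplied forward rather than simply added. This is the kind of compounding that the abstract attributes to single-step statistical inference.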

Paper Structure

This paper contains 29 sections, 26 equations, 13 figures, 6 tables, 2 algorithms.

Figures (13)

  • Figure 1: The proposed framework for DMWM.
  • Figure 2: Logic reasoning for LINN-S2.
  • Figure 3: Heatmap of deep logic correlations for sequential imagination $s_0 \land a_0 \land \dots \land s_{29} \land a_{29} \rightarrow s_{30}$ with reasoning depth $\alpha = 30$. The horizontal axis indicates the past states $s_i$ and the vertical axis indicates past actions $a_j$. The color of points represents the logical strength of long-term state-action pairs.
  • Figure 4: Performance comparison on 4 DMC tasks as a function of environment trials, i.e., the number of times the models explore the environments. The vertical axis indicates the average return over 100 test episodes. Complete results on 20 DMC tasks are provided in Appendix \ref{eps}.
  • Figure 5: Performance comparison on 4 DMC tasks as a function of environment steps, i.e., the number of environment interactions. The vertical axis denotes the average test return over 100 episodes. Complete test results on 20 DMC tasks are provided in Appendix \ref{es}.
  • ...and 8 more figures