Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making

Haldun Balim, Na Li, Yilun Du

TL;DR

Offline decision-making requires reliable, safe trajectories from fixed data. MPDiffuser is a compositional diffusion framework with a planner, a diffusion-based dynamics model, and a ranker. Planner and dynamics updates alternate during sampling to enforce both task fidelity and dynamic feasibility, and the ranker selects the behaviors that best match the task objectives. The approach improves feasibility, sample efficiency, and adaptability on unconstrained (D4RL) and constrained (DSRL) offline benchmarks, and is further explored in a preliminary vision-based study and a deployment on a real quadrupedal robot. Combining dynamics guidance with planner-driven objectives makes the method practical for safety-critical control in offline regimes.

Abstract

Offline decision-making requires synthesizing reliable behaviors from fixed datasets without further interaction, yet existing generative approaches often yield trajectories that are dynamically infeasible. We propose Model Predictive Diffuser (MPDiffuser), a compositional model-based diffusion framework consisting of: (i) a planner that generates diverse, task-aligned trajectories; (ii) a dynamics model that enforces consistency with the underlying system dynamics; and (iii) a ranker module that selects behaviors aligned with the task objectives. MPDiffuser employs an alternating diffusion sampling scheme, where planner and dynamics updates are interleaved to progressively refine trajectories for both task alignment and feasibility during the sampling process. We also provide a theoretical rationale for this procedure, showing how it balances fidelity to data priors with dynamics consistency. Empirically, the compositional design improves sample efficiency, as it leverages even low-quality data for dynamics learning and adapts seamlessly to novel dynamics. We evaluate MPDiffuser on both unconstrained (D4RL) and constrained (DSRL) offline decision-making benchmarks, demonstrating consistent gains over existing approaches. Furthermore, we present a preliminary study extending MPDiffuser to vision-based control tasks, showing its potential to scale to high-dimensional sensory inputs. Finally, we deploy our method on a real quadrupedal robot, showcasing its practicality for real-world control.
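
The alternating sampling scheme described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the double-integrator dynamics, the goal, and the functions `planner_step`, `dynamics_step`, and `ranker_cost` are hypothetical hand-written stand-ins for the learned planner, dynamics model, and ranker, and the update rules are not the paper's. The sketch only shows the control flow: each reverse step applies a planner update followed by a dynamics update, and a ranker then picks one of several candidates.

```python
import numpy as np

# Toy double-integrator dynamics x_{t+1} = A x_t + B u_t (assumed for illustration).
DT = 0.1
A = np.array([[1.0, DT], [0.0, 1.0]])
B = np.array([[0.0], [DT]])
GOAL = np.array([1.0, 0.0])


def rollout(x0, actions):
    """Open-loop simulate the actions from x0; returns states of shape (H + 1, 2)."""
    states = [np.asarray(x0, dtype=float)]
    for u in actions:
        states.append(A @ states[-1] + B @ u)
    return np.stack(states)


def dynamics_residual(states, actions):
    """Mean squared violation of x_{t+1} = A x_t + B u_t along a trajectory."""
    predicted = states[:-1] @ A.T + actions @ B.T
    return float(np.mean((states[1:] - predicted) ** 2))


def planner_step(states, actions, noise_scale, rng):
    """Stand-in for a planner denoising update: drift states toward the goal, add noise."""
    states = states + 0.2 * (GOAL - states) + noise_scale * rng.standard_normal(states.shape)
    actions = actions + noise_scale * rng.standard_normal(actions.shape)
    return states, actions


def dynamics_step(states, actions, x0, weight=0.5):
    """Stand-in for a dynamics update: blend states toward the rollout of the actions."""
    simulated = rollout(x0, actions)
    return (1.0 - weight) * states + weight * simulated, actions


def ranker_cost(states, actions):
    """Stand-in ranker: distance of the final state to the goal (lower is better)."""
    return float(np.linalg.norm(states[-1] - GOAL))


def sample(x0, horizon=20, n_steps=30, n_candidates=8, seed=0):
    """Alternate planner and dynamics updates, then rank the candidates."""
    rng = np.random.default_rng(seed)
    candidates = []
    for _ in range(n_candidates):
        states = rng.standard_normal((horizon + 1, 2))
        actions = rng.standard_normal((horizon, 1))
        for k in range(n_steps, 0, -1):
            noise_scale = 0.1 * (k - 1) / n_steps  # no noise at the final step
            states, actions = planner_step(states, actions, noise_scale, rng)
            states, actions = dynamics_step(states, actions, x0)
        candidates.append((states, actions))
    costs = [ranker_cost(s, a) for s, a in candidates]
    best = int(np.argmin(costs))
    return candidates[best], costs, best


if __name__ == "__main__":
    (best_states, best_actions), costs, best = sample(np.zeros(2))
    print("selected candidate:", best)
    print("dynamics residual:", dynamics_residual(best_states, best_actions))
```

In this toy, the blend in `dynamics_step` shrinks the dynamics residual by the factor `1 - weight` at every call because the residual is affine in the states. That is a property of the stand-in, not a claim about the learned dynamics model.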

Paper Structure

This paper contains 42 sections, 25 equations, 17 figures, 17 tables, 2 algorithms.

Figures (17)

  • Figure 1: Framework Overview. Left: Our proposed framework, MPDiffuser, which couples a diffusion-based planner with a diffusion-based dynamics model, complemented by a ranking module. Right: A comparison highlighting key differences between MPDiffuser and prior diffusion-based trajectory generation methods.
  • Figure 2: Illustrative scenario. We compare sampled state trajectories with the open-loop simulations obtained by simulating the sampled actions on a 5-dimensional kinematic bike model. Diffuser fails to generate admissible state trajectories that reach the goal, while Decision Diffuser produces plausible states whose actions diverge under simulation. In contrast, MPDiffuser yields trajectories that remain faithful to the system dynamics.
  • Figure 3: Fetch PickAndPlace
  • Figure 4: Dynamics consistency of sampled trajectories. Mean state error over the prediction horizon for: block position in world coordinates, block position relative to the end-effector, and all state dimensions combined. MPDiffuser achieves lower state prediction error compared to Decision Diffuser and the planner-only baseline, indicating improved consistency with system dynamics.
  • Figure 5: Walker2D visualization (highlights defective joint)
  • ...and 12 more figures
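
The dynamics-consistency comparison in the captions of Figures 2 and 4 (sampled states versus an open-loop simulation of the sampled actions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code: the captions do not specify the bike model, so the 5-D state layout (x, y, heading, speed, steering angle), the two actions (acceleration, steering rate), the Euler discretization, and the constants `WHEELBASE` and `DT` are all assumed here.

```python
import numpy as np

WHEELBASE = 2.5  # metres; assumed value
DT = 0.1         # seconds; assumed value


def bike_step(state, action):
    """One Euler step of a kinematic bicycle model (assumed 5-D state layout)."""
    x, y, heading, speed, steer = state
    accel, steer_rate = action
    return np.array([
        x + DT * speed * np.cos(heading),
        y + DT * speed * np.sin(heading),
        heading + DT * speed * np.tan(steer) / WHEELBASE,
        speed + DT * accel,
        steer + DT * steer_rate,
    ])


def open_loop_rollout(x0, actions):
    """Simulate the actions from x0 with no feedback; returns shape (H + 1, 5)."""
    states = [np.asarray(x0, dtype=float)]
    for u in actions:
        states.append(bike_step(states[-1], u))
    return np.stack(states)


def state_error_over_horizon(sampled_states, actions):
    """Per-step Euclidean distance between sampled states and the open-loop simulation."""
    simulated = open_loop_rollout(sampled_states[0], actions)
    return np.linalg.norm(sampled_states - simulated, axis=1)


if __name__ == "__main__":
    actions = np.tile([0.5, 0.01], (10, 1))
    consistent = open_loop_rollout([0.0, 0.0, 0.0, 1.0, 0.0], actions)
    inconsistent = consistent.copy()
    inconsistent[5:, 1] += 0.3  # states that the actions cannot reproduce
    print(state_error_over_horizon(consistent, actions))
    print(state_error_over_horizon(inconsistent, actions))
```

A trajectory whose states match the simulation of its own actions has zero error at every step. Averaging the per-step error over many sampled trajectories gives a curve over the prediction horizon of the kind the Figure 4 caption describes.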