DDAT: Diffusion Policies Enforcing Dynamically Admissible Robot Trajectories

Jean-Baptiste Bouvier, Kanghyun Ryu, Kartik Nagpal, Qiayuan Liao, Koushil Sreenath, Negar Mehr

TL;DR

The paper tackles the challenge of generating dynamically admissible robot trajectories with diffusion models when the robot dynamics are only available as a black box. It introduces DDAT, a framework that enforces dynamic feasibility by projecting predicted trajectories onto a dynamically admissible manifold, using polytopic under-approximations of reachable sets and several projection strategies that can be applied during training and inference. The authors report that combining state and action predictions with projection-based admissibility substantially improves the SAE and CAE metrics across multiple simulated and real-world platforms, including a quadcopter and the Unitree GO1 and GO2, and that projection timing and curriculum critically affect trajectory quality. The approach enables more reliable long-horizon planning with diffusion models in underactuated robotics and reduces the need for continual replanning. Future work targets offline settings, learning dynamics, and faster inference for closed-loop control.

Abstract

Diffusion models excel at creating images and videos thanks to their multimodal generative capabilities. These same capabilities have made diffusion models increasingly popular in robotics research, where they are used for generating robot motion. However, the stochastic nature of diffusion models is fundamentally at odds with the precise dynamical equations describing the feasible motion of robots. Hence, generating dynamically admissible robot trajectories is a challenge for diffusion models. To alleviate this issue, we introduce DDAT: Diffusion policies for Dynamically Admissible Trajectories to generate provably admissible trajectories of black-box robotic systems using diffusion models. A sequence of states is a dynamically admissible trajectory if each state of the sequence belongs to the reachable set of its predecessor by the robot's equations of motion. To generate such trajectories, our diffusion policies project their predictions onto a dynamically admissible manifold during both training and inference to align the objective of the denoiser neural network with the dynamical admissibility constraint. The auto-regressive nature of these projections along with the black-box nature of robot dynamics render these projections immensely challenging. We thus enforce admissibility by iteratively sampling a polytopic under-approximation of the reachable set of a state onto which we project its predicted successor, before iterating this process with the projected successor. By producing accurate trajectories, this projection eliminates the need for diffusion models to continually replan, enabling one-shot long-horizon trajectory planning. We demonstrate that our framework generates higher quality dynamically admissible robot trajectories through extensive simulations on a quadcopter and various MuJoCo environments, along with real-world experiments on a Unitree GO1 and GO2.
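
The iterative projection described in the abstract can be sketched on a toy system. This is a minimal sketch, not the authors' implementation: a double integrator with a scalar action in [-1, 1] stands in for the black-box dynamics, so the polytope spanned by the two extremal next states is a segment and the projection has a closed form. The names `f`, `project_onto_segment`, `project_trajectory`, and `is_admissible`, along with `DT` and the action bounds, are assumptions made for illustration.

```python
# Toy illustration only: a double integrator stands in for the black-box dynamics.
DT = 0.1                   # assumed integration step
A_MIN, A_MAX = -1.0, 1.0   # assumed extremal actions of the admissible action set


def f(state, action):
    """Hypothetical dynamics; state = (position, velocity), scalar action."""
    p, v = state
    return (p + v * DT, v + action * DT)


def project_onto_segment(point, v0, v1):
    """Euclidean projection of `point` onto the segment conv(v0, v1)."""
    d = (v1[0] - v0[0], v1[1] - v0[1])
    dd = d[0] ** 2 + d[1] ** 2
    if dd == 0.0:
        return v0
    t = ((point[0] - v0[0]) * d[0] + (point[1] - v0[1]) * d[1]) / dd
    t = min(1.0, max(0.0, t))  # clamp so the result stays on the segment
    return (v0[0] + t * d[0], v0[1] + t * d[1])


def project_trajectory(predicted):
    """Project each predicted state onto the polytope reachable from the
    previously projected state; the initial state is kept as given."""
    admissible = [predicted[0]]
    for s_pred in predicted[1:]:
        s_prev = admissible[-1]
        # Extremal next states obtained by applying the extremal actions.
        v0, v1 = f(s_prev, A_MIN), f(s_prev, A_MAX)
        admissible.append(project_onto_segment(s_pred, v0, v1))
    return admissible


def is_admissible(traj, tol=1e-9):
    """Check that every state is reachable from its predecessor under f."""
    for s, s_next in zip(traj, traj[1:]):
        position_ok = abs(s_next[0] - (s[0] + s[1] * DT)) <= tol
        implied_action = (s_next[1] - s[1]) / DT
        action_ok = A_MIN - tol <= implied_action <= A_MAX + tol
        if not (position_ok and action_ok):
            return False
    return True


# A prediction that violates the dynamics, then its admissible projection.
predicted = [(0.0, 0.0), (0.05, 0.5), (0.2, 0.3)]
projected = project_trajectory(predicted)
```

Because each projection starts from the previously projected state, a correction at one step changes the reachable set used at every later step. With higher-dimensional actions the polytope has more vertices and the projection generally requires solving a small quadratic program instead of using a closed form.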

Paper Structure

This paper contains 42 sections, 23 equations, 8 figures, 8 tables, 8 algorithms.

Figures (8)

  • Figure 1: Illustration of the trajectory projection \ref{eq: projection}. The extremal actions $v_1, v_2, v_3, v_4$ of the admissible action set $\mathcal{A}$ are applied to $s_t$ with dynamics $f$ to get the extremal next states $f(s_t, v_1), f(s_t, v_2), f(s_t, v_3), f(s_t, v_4)$. Their convex hull generates a polytope $\mathcal{C}(s_t)$ underapproximating the actual reachable set $\mathcal{R}(s_t)$. The predicted next state $\tilde{s}_{t+1}$ is then projected by $\mathcal{P}$ onto $\mathcal{C}(s_t)$ to obtain the admissible next state $s_{t+1}$.
  • Figure 2: Diverging projection of the Hopper height. Our diffusion model samples the orange trajectory given the initial state of the blue trajectory, which is loaded from our dataset. The projection of the orange trajectory with Algorithm \ref{alg: proj} yields the diverging but admissible green trajectory. However, the green trajectory is not the admissible trajectory closest to the orange one, since the blue trajectory is admissible and much closer to the orange one.
  • Figure 3: Illustration of the reduced search space $\hat{\mathcal{C}}(s_t)$ of \ref{eq: shrunk reachable set approx} resulting from the reduced action space $\mathrm{conv}(\hat{v}_1, \hat{v}_2, \hat{v}_3, \hat{v}_4)$ of \ref{eq: shrunk action vertices}, shrunk by a factor $\delta$ around the action prediction $\tilde{a}_t$. Finding the best next state $s_{t+1}$ is much easier in the smaller set $\hat{\mathcal{C}}(s_t)$ than in the larger $\mathcal{C}(s_t)$.
  • Figure 4: Illustration of projectors $\mathcal{P}^A = f(s_t, \tilde{a}_t)$ (blue) and $\mathcal{P}^{SA} = f(s_t, \tilde{a}_t + \delta a_t)$ (green). The feedback correction policy $\pi_\theta$ of $\mathcal{P}^{SA}$ leverages the open-loop error to generate a corrective action $\delta a_t$ instead of discarding the prediction $\tilde{s}_{t+1}$ as $\mathcal{P}^A$ does.
  • Figure 5: Ratios of trajectories deployed open-loop having fallen at a given timestep for the Hopper and Walker. The shade is the maximum and minimum number of trajectories having fallen at each timestep over 5 runs of 100 trajectories each.
  • ...and 3 more figures
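
The reduced action space of Figure 3 and the projector $\mathcal{P}^A$ of Figure 4 can be illustrated with a short sketch. It is hedged throughout: the shrinking rule $\hat{v}_i = \tilde{a}_t + \delta (v_i - \tilde{a}_t)$ is an assumed form chosen for illustration, since the referenced equations are not reproduced on this page, and `shrink_vertices`, `project_action_only`, the planar point-mass dynamics `f`, and the box `BOX` are hypothetical names.

```python
DT = 0.1  # assumed integration step of the toy dynamics


def f(state, action):
    """Hypothetical stand-in for the black-box dynamics: planar point mass.
    state = (x, y, vx, vy), action = (ax, ay)."""
    x, y, vx, vy = state
    ax, ay = action
    return (x + vx * DT, y + vy * DT, vx + ax * DT, vy + ay * DT)


def shrink_vertices(vertices, a_pred, delta):
    """Contract each extremal action toward the predicted action a_pred.
    Assumed rule: v_hat = a_pred + delta * (v - a_pred), with delta in [0, 1].
    By convexity the shrunk vertices stay in the action set if a_pred does."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return [
        tuple(ap + delta * (vi - ap) for vi, ap in zip(v, a_pred))
        for v in vertices
    ]


def project_action_only(state, a_pred):
    """P^A-style projector: ignore the predicted next state and simply
    roll the predicted action through the dynamics."""
    return f(state, a_pred)


# Extremal actions v_1..v_4 of an assumed box-shaped admissible action set.
BOX = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]

s_t = (0.0, 0.0, 1.0, 0.0)
a_tilde = (0.5, 0.0)

shrunk = shrink_vertices(BOX, a_tilde, delta=0.5)
# Vertices of the reduced search space: extremal next states of the shrunk set.
reduced_polytope = [f(s_t, v) for v in shrunk]
next_state = project_action_only(s_t, a_tilde)
```

Setting `delta` to 1 recovers the full set of extremal actions, while `delta` equal to 0 collapses the search space to the single state returned by `project_action_only`.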

Theorems & Definitions (6)

  • Definition 1
  • Definition 2
  • Remark 1
  • Definition 3
  • Definition 4
  • Remark 2