Delay-Aware Diffusion Policy: Bridging the Observation-Execution Gap in Dynamic Tasks

Aileen Liao, Dong-Ki Kim, Max Olan Smith, Ali-akbar Agha-mohammadi, Shayegan Omidshafiei

TL;DR

This work addresses the critical problem of inference delay in dynamic robotic tasks, where actions are computed after observations that no longer reflect the actual state. It introduces Delay-Aware Diffusion Policy (DA-DP), which corrects zero-delay training trajectories to account for execution delay and conditions the diffusion policy on measured delay, enhancing robustness to latency. Through experiments across multiple tasks and robots, DA-DP demonstrates superior performance over delay-unaware baselines, maintains stability under varying delays, and generalizes to new morphologies and out-of-distribution delays. The approach provides a practical, plug-and-play pattern for delay-aware imitation learning and urges reporting performance as a function of latency, not just task difficulty.

Abstract

As a robot senses and selects actions, the world keeps changing. This inference delay creates a gap of tens to hundreds of milliseconds between the observed state and the state at execution. In this work, we take the natural generalization from zero delay to measured delay during training and inference. We introduce Delay-Aware Diffusion Policy (DA-DP), a framework for explicitly incorporating inference delays into policy learning. DA-DP corrects zero-delay trajectories to their delay-compensated counterparts, and augments the policy with delay conditioning. We empirically validate DA-DP on a variety of tasks, robots, and delays and find its success rate more robust to delay than delay-unaware methods. DA-DP is architecture agnostic and transfers beyond diffusion policies, offering a general pattern for delay-aware imitation learning. More broadly, DA-DP encourages evaluation protocols that report performance as a function of measured latency, not just task difficulty.
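The two ingredients named above, correcting zero-delay trajectories and conditioning on the delay, can be illustrated with a minimal data-relabeling sketch. This is not the authors' implementation. It assumes that the correction amounts to shifting each action chunk by the number of control steps that elapse during inference, and the names `delay_steps`, `relabel_with_delay`, `delay_s`, `dt`, and `h_act` are hypothetical.

```python
import math
import numpy as np


def delay_steps(delay_s, dt):
    """Control steps that elapse during an inference delay (rounded up).

    Assumption: the delay is discretized to whole control steps of length dt.
    The small epsilon guards against floating-point noise in delay_s / dt.
    """
    return max(0, int(math.ceil(delay_s / dt - 1e-9)))


def relabel_with_delay(observations, actions, delay_s, dt, h_act):
    """Build (observation, delay, action chunk) training tuples.

    Assumption: a zero-delay demonstration pairs observation t with the
    chunk starting at step t. With a delay of k steps, the chunk can only
    start executing at step t + k, so the label is shifted accordingly.
    The delay is kept in each tuple so a policy can be conditioned on it.
    """
    k = delay_steps(delay_s, dt)
    last_start = len(actions) - h_act  # last index where a full chunk fits
    samples = []
    for t in range(0, last_start - k + 1):
        chunk = actions[t + k : t + k + h_act]
        samples.append((observations[t], delay_s, chunk))
    return samples


# Toy demonstration: 10 steps, 1-D observations and actions.
obs = np.arange(10, dtype=float).reshape(10, 1)
act = (np.arange(10, dtype=float) * 10.0).reshape(10, 1)

# A delay of two control steps with chunks of four actions.
samples = relabel_with_delay(obs, act, delay_s=0.2, dt=0.1, h_act=4)
first_obs, first_delay, first_chunk = samples[0]
```

In this toy setup the first tuple pairs the observation at step 0 with the actions from steps 2 through 5, so the policy is trained to predict what should be executed once inference has finished rather than what was appropriate at observation time.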

Paper Structure

This paper contains 14 sections, 8 equations, 10 figures, 2 algorithms.

Figures (10)

  • Figure 1: While Diffusion Policy (DP) struggles with computation delays and fails to hit the ball in the ping-pong task, our Delay-Aware Diffusion Policy (DA-DP) successfully handles highly dynamic, reactive tasks under inference delays.
  • Figure 2: Overview of Delay-Aware Diffusion Policy (DA-DP). The DA-DP framework is depicted for the case with action chunk length $H_\text{act}=4$ and inference delay $\delta=2\Delta t$, as an example.
  • Figure 3: Three dynamic environments used for experimental evaluation. In the pick up rolling ball task, a ball rolls across a table and the Franka Emika Panda robotic arm picks it up and holds it at a specified location. In the ping-pong task, the Panda arm hits a ball after a serve. In the pick and place moving box task, a Unitree G1 humanoid picks up a box sliding across a table and places it on the opposing table.
  • Figure 4: Performance comparisons in the pick up rolling ball domain across different constant inference delays.
  • Figure 5: Performance comparisons in the ping-pong domain across different constant inference delays.
  • ...and 5 more figures

Theorems & Definitions (1)

  • Remark 1: Discrete implementation