Enhancing Diffusion Policy with Classifier-Free Guidance for Temporal Robotic Tasks

Yuang Lu, Song Wang, Xiao Han, Xuri Zhang, Yucong Wu, Zhicheng He

TL;DR

The paper addresses two weaknesses of diffusion-policy robotic control on sequential tasks: unreliable termination and missing temporal context. It introduces a classifier-free guidance-based diffusion policy (CFG-DP) that conditions action generation on explicit timestep cues and uses a dynamic guidance factor to bias the policy toward termination as a cycle nears completion. Real-world experiments with a humanoid robot on a screwing task show higher success rates, far fewer repetitive actions, and robust termination. Ablations confirm that both the timestep inputs and a tuned guidance strength matter. The approach promises more deterministic and reliable execution of time-dependent robotic tasks, with potential extensions to hierarchical and multi-task scenarios.

Abstract

Temporal sequential tasks challenge humanoid robots, as existing Diffusion Policy (DP) and Action Chunking with Transformers (ACT) methods often lack temporal context, resulting in local optima traps and excessive repetitive actions. To address these issues, this paper introduces a Classifier-Free Guidance-Based Diffusion Policy (CFG-DP), a novel framework to enhance DP by integrating Classifier-Free Guidance (CFG) with conditional and unconditional models. Specifically, CFG leverages timestep inputs to track task progression and ensure precise cycle termination. It dynamically adjusts action predictions based on task phase, using a guidance factor tuned to balance temporal coherence and action accuracy. Real-world experiments on a humanoid robot demonstrate high success rates and minimal repetitive actions. Furthermore, we assessed the model's ability to terminate actions and examined how different components and parameter adjustments affect its performance. This framework significantly enhances deterministic control and execution reliability for sequential robotic tasks.
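The abstract's combination of conditional and unconditional predictions under a guidance factor can be illustrated with a minimal sketch. It uses the common classifier-free guidance form `eps_uncond + w * (eps_cond - eps_uncond)`; whether CFG-DP uses exactly this form is an assumption. The linear schedule in `guidance_weight`, its parameters `w_min` and `w_max`, and all function names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def guidance_weight(step: int, cycle_len: int,
                    w_min: float = 1.0, w_max: float = 3.0) -> float:
    """Hypothetical schedule: ramp the guidance factor linearly with task progress.

    The paper states only that the factor is adjusted by task phase;
    the linear ramp and the default bounds here are assumptions.
    """
    progress = min(max(step / cycle_len, 0.0), 1.0)  # clamp to [0, 1]
    return w_min + (w_max - w_min) * progress


def cfg_combine(eps_uncond: Sequence[float], eps_cond: Sequence[float],
                w: float) -> List[float]:
    """Common classifier-free guidance mix of two noise predictions.

    w = 0 returns the unconditional prediction, w = 1 the conditional one,
    and w > 1 extrapolates past the conditional prediction.
    """
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values standing in for the two model outputs at one denoising step.
eps_uncond = [0.0, 0.0]
eps_cond = [1.0, 2.0]

w = guidance_weight(step=50, cycle_len=100)   # halfway through the cycle
guided = cfg_combine(eps_uncond, eps_cond, w)
print(w, guided)
```

Under this assumed schedule the timestep condition has more influence late in the cycle, which is one plausible way to bias the policy toward termination.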

Paper Structure

This paper contains 19 sections, 5 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Overview of the proposed model architecture. Left: Illustration of the observation processing pipeline, showing how different inputs are combined and processed. Right: At time $t$, the policy processes the latest $T_o$ steps of observation $O_t$ and generates $T_a$ steps of actions $A_t$.
  • Figure 2: Comparison of DP and CFG-DP models in the screwing task. (a) The DP model exhibits repetitive cyclic actions, failing to terminate properly. (b) The CFG-DP model successfully completes the task with a precise termination action.
  • Figure 3: Action trajectories for the screwing task on the validation set, using wrist joint states. Top: Diffusion Policy (DP). Middle: Action Chunking with Transformers (ACT). Bottom: Classifier-Free Guidance Diffusion Policy (CFG-DP).
  • Figure 4: Distribution of termination steps for the screwing task.
  • Figure 5: Conditional entropy across ablation experiments.
  • ...and 1 more figures
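
The windowing in the Figure 1 caption (the latest $T_o$ observation steps in, $T_a$ action steps out) can be sketched as a simple loop. This is an illustrative sketch only: the `rollout` function, the padding of a short history by repeating its oldest observation, and the toy policy are assumptions, not the authors' implementation.

```python
from collections import deque
from typing import Callable, List, Sequence


def rollout(policy: Callable[[List[float]], Sequence[float]],
            observations: Sequence[float], T_o: int, T_a: int) -> List[float]:
    """Replan every T_a steps from the latest T_o observations.

    `policy` is a hypothetical callable that maps a window of T_o
    observations to a chunk of T_a actions.
    """
    window: deque = deque(maxlen=T_o)  # keeps only the latest T_o observations
    executed: List[float] = []
    chunk: Sequence[float] = []
    for t, obs in enumerate(observations):
        window.append(obs)
        if t % T_a == 0:
            # Assumption: pad a short history by repeating its oldest entry.
            padded = [window[0]] * (T_o - len(window)) + list(window)
            chunk = policy(padded)
            assert len(chunk) == T_a, "policy must return T_a actions"
        executed.append(chunk[t % T_a])  # execute the next action in the chunk
    return executed


# Toy policy: every action in the chunk is the sum of the observation window.
T_o, T_a = 2, 2
toy_policy = lambda obs_window: [sum(obs_window)] * T_a
actions = rollout(toy_policy, [1, 2, 3, 4, 5], T_o, T_a)
print(actions)  # [2, 2, 5, 5, 9]
```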