Enhancing Diffusion Policy with Classifier-Free Guidance for Temporal Robotic Tasks
Yuang Lu, Song Wang, Xiao Han, Xuri Zhang, Yucong Wu, Zhicheng He
TL;DR
The paper tackles termination and temporal-context challenges in diffusion-policy-driven robotic control for sequential tasks. It introduces a classifier-free guidance-based diffusion policy (CFG-DP) that conditions action generation on explicit timestep cues and uses a dynamic guidance factor to bias the policy toward termination as a cycle nears completion. Real-world humanoid experiments on a screwing task show higher success rates, far fewer repetitive actions, and robust termination. Ablations confirm the importance of the timestep inputs and of a tuned guidance strength. The approach promises more deterministic and reliable execution of time-dependent robotic tasks, with potential extensions to hierarchical and multi-task scenarios.
Abstract
Temporal sequential tasks challenge humanoid robots because existing Diffusion Policy (DP) and Action Chunking with Transformers (ACT) methods often lack temporal context, which leads to local-optima traps and excessive repetitive actions. To address these issues, this paper introduces the Classifier-Free Guidance-Based Diffusion Policy (CFG-DP), a framework that enhances DP by integrating Classifier-Free Guidance (CFG) with conditional and unconditional models. Specifically, CFG-DP uses timestep inputs to track task progression and ensure precise cycle termination. It dynamically adjusts action predictions according to the task phase, using a guidance factor tuned to balance temporal coherence and action accuracy. Real-world experiments on a humanoid robot demonstrate high success rates and minimal repetitive actions. We also assess the model's ability to terminate actions and examine how individual components and parameter settings affect its performance. The framework significantly enhances deterministic control and execution reliability for sequential robotic tasks.
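
As a rough illustration of how a conditional and an unconditional noise prediction can be blended under classifier-free guidance, the sketch below uses one commonly used form of the CFG combination rule, eps = eps_uncond + w * (eps_cond - eps_uncond). The exact formulation used by CFG-DP is not stated in this summary, so the rule, the linear schedule in `guidance_weight`, and the values `w_min`, `w_max`, and `cycle_len` are all assumptions chosen for illustration, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def guidance_weight(t: int, cycle_len: int,
                    w_min: float = 1.0, w_max: float = 3.0) -> float:
    """Hypothetical schedule: guidance grows linearly with cycle progress.

    t is the current timestep within a task cycle and cycle_len is the
    assumed cycle length. Both names are illustrative only.
    """
    progress = min(max(t / cycle_len, 0.0), 1.0)  # clamp to [0, 1]
    return w_min + (w_max - w_min) * progress


def cfg_combine(eps_uncond: Vector, eps_cond: Vector, w: float) -> Vector:
    """Blend unconditional and timestep-conditioned noise predictions.

    w = 0 returns the unconditional prediction, w = 1 the conditional one,
    and w > 1 extrapolates further toward the conditional prediction.
    """
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy numbers standing in for the outputs of the two denoising models.
eps_u = [0.0, 1.0]
eps_c = [1.0, 3.0]
w_late = guidance_weight(t=10, cycle_len=10)  # end of the cycle
guided = cfg_combine(eps_u, eps_c, w_late)
```

Under these assumptions, a larger weight late in the cycle pushes the denoised action further toward what the timestep-conditioned model predicts, which is one way a policy could be biased toward terminating rather than repeating the motion.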
