DynaGuide: Steering Diffusion Policies with Active Dynamic Guidance
Maximilian Du, Shuran Song
TL;DR
DynaGuide steers large, pretrained diffusion policies without retraining by coupling a separate latent dynamics model with the diffusion denoising process. The dynamics model predicts future observations in the DINOv2 latent space. A differentiable guidance metric compares these predictions against a set of positive and negative objectives, and the gradient of that metric is injected into the action denoising step via DDIM. Because the guidance signal is external to the policy, DynaGuide supports multi-objective steering, can amplify underrepresented behaviors, and works with off-the-shelf policies. Across CALVIN simulations and real-robot experiments, DynaGuide reaches up to 70–80% steering success and outperforms goal-conditioning by up to 5.4x under low-quality guidance, while also handling multiple objectives and novel behaviors. The result is a plug-and-play steering approach for deploying complex robot policies in uncertain real-world settings. Its main limitation is that guidance is specified through observations; multimodal guidance and memory-enabled extensions are left to future work.
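The guidance mechanism summarized above can be illustrated with a toy sketch. It is written under several assumptions and is not the authors' implementation: the linear `predict_latent` is a hypothetical stand-in for the learned latent dynamics model, the 2-D vectors stand in for DINOv2 embeddings, and the log-sum-exp form of `guidance_metric` is just one plausible choice of differentiable metric over positive and negative objectives. The noise correction in `guided_ddim_step` follows the standard classifier-guidance rule for deterministic DDIM sampling.

```python
import numpy as np

def logsumexp(x):
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def predict_latent(z, action, B):
    # Hypothetical linear stand-in for the learned latent dynamics model.
    return z + B @ action

def guidance_metric(z_hat, pos, neg, sigma=1.0):
    # Assumed form: high when z_hat is near any positive goal
    # and far from every negative goal.
    d_pos = np.sum((pos - z_hat) ** 2, axis=1)
    d_neg = np.sum((neg - z_hat) ** 2, axis=1)
    return logsumexp(-d_pos / sigma) - logsumexp(-d_neg / sigma)

def guidance_grad(z, action, B, pos, neg, sigma=1.0):
    # Analytic gradient of the metric with respect to the action.
    z_hat = predict_latent(z, action, B)

    def grad_term(goals):
        d = np.sum((goals - z_hat) ** 2, axis=1)
        w = softmax(-d / sigma)
        # d/dz_hat logsumexp(-d/sigma) = (2/sigma) * sum_i w_i (g_i - z_hat)
        return (2.0 / sigma) * (w @ (goals - z_hat))

    dz = grad_term(pos) - grad_term(neg)
    return B.T @ dz  # chain rule through z_hat = z + B @ action

def guided_ddim_step(a_t, eps, grad, abar_t, abar_prev, scale):
    # Classifier-guidance-style correction of the predicted noise,
    # followed by a deterministic DDIM update (eta = 0).
    eps_g = eps - scale * np.sqrt(1.0 - abar_t) * grad
    a0 = (a_t - np.sqrt(1.0 - abar_t) * eps_g) / np.sqrt(abar_t)
    return np.sqrt(abar_prev) * a0 + np.sqrt(1.0 - abar_prev) * eps_g

# Toy setup: two positive goals, one negative goal, identity dynamics.
z = np.zeros(2)
B = np.eye(2)
pos = np.array([[1.0, 0.0], [1.0, 1.0]])
neg = np.array([[-1.0, 0.0]])
a_t = np.zeros(2)
eps = np.zeros(2)  # placeholder for the base policy's noise prediction

grad = guidance_grad(z, a_t, B, pos, neg)
a_plain = guided_ddim_step(a_t, eps, grad, 0.5, 0.9, scale=0.0)
a_guided = guided_ddim_step(a_t, eps, grad, 0.5, 0.9, scale=0.1)
```

In this sketch the base policy is untouched: steering enters only through `grad`, so changing the goal sets `pos` and `neg` changes the behavior at sampling time. A real system would obtain the gradient by backpropagating through the dynamics model rather than from a closed-form expression.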
Abstract
Deploying large, complex policies in the real world requires the ability to steer them to fit the needs of a situation. Most common steering approaches, like goal-conditioning, require training the robot policy with a distribution of test-time objectives in mind. To overcome this limitation, we present DynaGuide, a steering method for diffusion policies using guidance from an external dynamics model during the diffusion denoising process. DynaGuide separates the dynamics model from the base policy, which gives it multiple advantages, including the ability to steer towards multiple objectives, enhance underrepresented base policy behaviors, and maintain robustness on low-quality objectives. The separate guidance signal also allows DynaGuide to work with off-the-shelf pretrained diffusion policies. We demonstrate the performance and features of DynaGuide against other steering approaches in a series of simulated and real experiments, showing an average steering success of 70% on a set of articulated CALVIN tasks and outperforming goal-conditioning by 5.4x when steered with low-quality objectives. We also successfully steer an off-the-shelf real robot policy to express preference for particular objects and even create novel behavior. Videos and more can be found on the project website: https://dynaguide.github.io
