Theoretical Closed-loop Stability Bounds for Dynamical System Coupled with Diffusion Policies
Gabriel Lauzier, Alexandre Girard, François Ferland
TL;DR
The paper addresses the challenge of real-time control with diffusion-based policies by analyzing the stability of a linear plant when the denoising (diffusion) process is partially integrated into the control loop. It derives closed-form stability bounds for both 1-D and N-D systems and shows how the time-scale ratio, diffusion gain, and demonstration variance jointly determine stability. A key result is that partial diffusion can be as stable as full diffusion under certain conditions. Practical outcomes are a variance-based dataset quality metric and a framework for trading denoising depth against latency, enabling faster imitation learning in robotics. The work also discusses limitations, notably the assumption of adequate computation time for action planning, and suggests future extensions to stochasticity, nonlinear dynamics, and high-dimensional (vision-conditioned) validation.
Abstract
Diffusion Policy has shown strong performance in robotic manipulation tasks under stochastic perturbations, owing to its ability to model multimodal action distributions. Nonetheless, its reliance on a computationally expensive reverse-time diffusion (denoising) process for action inference makes it challenging to use in real-time applications where quick decision-making is mandatory. This work studies the possibility of running the denoising process only partially before executing an action, so that the plant evolves according to its own dynamics in parallel with the reverse-time diffusion dynamics running on the computer. In a classical diffusion policy setting, the plant dynamics are usually slow and the two dynamical processes are uncoupled. Here, we investigate theoretical bounds on the stability of closed-loop systems using diffusion policies when the plant dynamics and the denoising dynamics are coupled. The contribution of this work is a framework for faster imitation learning and a metric that indicates whether a controller will be stable based on the variance of the demonstrations.
