Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
Xingzhuo Guo, Yu Zhang, Baixu Chen, Haoran Xu, Jianmin Wang, Mingsheng Long
TL;DR
Diffusion models applied to temporal predictive learning tend to underuse the temporal dynamics inherent in the data. Dynamical Diffusion (DyDiff) introduces temporally aware forward and reverse processes that use a Dynamics function and a gamma schedule to couple the current state with its history, enabling efficient training via reparameterization as well as multi-step generation. Across scientific spatiotemporal forecasting, video prediction, and time series forecasting, DyDiff consistently outperforms standard diffusion baselines, and ablations examine the roles of latents, dependent noises, and gamma schedules. By embedding temporal dynamics directly into the diffusion framework, the work targets temporally coherent forecasting in scientific and video domains. Code is available at https://github.com/thuml/dynamical-diffusion.
Abstract
Diffusion models have emerged as powerful generative frameworks by progressively adding noise to data through a forward process and then reversing this process to generate realistic samples. While these models have achieved strong performance across various tasks and modalities, their application to temporal predictive learning remains underexplored. Existing approaches treat predictive learning as a conditional generation problem, but often fail to fully exploit the temporal dynamics inherent in the data, leading to challenges in generating temporally coherent sequences. To address this, we introduce Dynamical Diffusion (DyDiff), a theoretically sound framework that incorporates temporally aware forward and reverse processes. Dynamical Diffusion explicitly models temporal transitions at each diffusion step, establishing dependencies on preceding states to better capture temporal dynamics. Through the reparameterization trick, Dynamical Diffusion achieves efficient training and inference similar to any standard diffusion model. Extensive experiments across scientific spatiotemporal forecasting, video prediction, and time series forecasting demonstrate that Dynamical Diffusion consistently improves performance in temporal predictive tasks, filling a crucial gap in existing methodologies. Code is available at this repository: https://github.com/thuml/dynamical-diffusion.
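For readers unfamiliar with the forward process mentioned in the abstract, the following is a minimal sketch of the standard (non-temporal) diffusion forward step that DyDiff builds on, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is not the DyDiff forward process: the Dynamics function and gamma schedule are not specified in this summary and are deliberately left out. The linear beta schedule, the function names make_alpha_bar and forward_noise, and the array shapes are all assumptions chosen for illustration.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal retention for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from x_0 in one shot via the reparameterization trick."""
    eps = rng.standard_normal(x0.shape)  # Gaussian noise, same shape as x0
    a = alpha_bar[t]
    xt = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return xt, eps

# Hypothetical toy input: 4 frames with 8 features each.
rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = np.ones((4, 8))
xt, eps = forward_noise(x0, 500, alpha_bar, rng)
```

In this standard form the noise added to each frame is independent of every other frame. According to the abstract, Dynamical Diffusion instead makes each diffusion step depend on preceding states.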
