Diffusion Models are Molecular Dynamics Simulators
Justin Diamond, Markus Lill
TL;DR
This work reframes diffusion-based molecular sampling as a form of molecular dynamics by equipping denoising diffusion steps with a simple harmonic adapter that creates a quadratic coupling between consecutive states. The key result is an exact Euler-Maruyama (EM) equivalence: each reverse-diffusion step with the adapter corresponds to one EM step for overdamped Langevin dynamics, with an implicit time step Δt = β/(2k) set by the spring stiffness k. The authors derive a finite-schedule KL bound showing convergence to MD as the grid is refined and the score model improves, and they provide a continuous-time limit, a practical algorithm, and a path to time-parallel trajectory generation. Empirically, the approach yields MD-like trajectories and Boltzmann-consistent statistics from a model trained only on static configurations, enabling trajectory-level observables with potentially orders-of-magnitude speedups and flexible coupling to MCMC, metadynamics, and alchemical methods. Overall, this work paves the way for data-driven, scalable MD that preserves thermodynamics while leveraging diffusion-model training and parallelism.
Abstract
We prove that a denoising diffusion sampler equipped with a sequential bias across the batch dimension is exactly an Euler-Maruyama integrator for overdamped Langevin dynamics. Each reverse denoising step, with its associated spring stiffness, can be interpreted as one step of a stochastic differential equation with an effective time step set jointly by the noise schedule and that stiffness. The learned score then plays the role of the drift, equivalently the gradient of a learned energy, yielding a precise correspondence between diffusion sampling and Langevin time evolution. This equivalence recasts molecular dynamics (MD) in terms of diffusion models. Accuracy is no longer tied to a fixed, extremely small MD time step; instead, it is controlled by two scalable knobs: model capacity, which governs how well the drift is approximated, and the number of denoising steps, which sets the integrator resolution. In practice, this leads to a fully data-driven MD framework that learns forces from uncorrelated equilibrium snapshots, requires no hand-engineered force fields, uses no trajectory data for training, and still preserves the Boltzmann distribution associated with the learned energy. We derive trajectory-level, information-theoretic error bounds that cleanly separate discretization error from score-model error, clarify how temperature enters through the effective spring, and show that the resulting sampler generates molecular trajectories with MD-like temporal correlations, even though the model is trained only on static configurations.
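To make the integrator named above concrete, the following is a minimal sketch of a plain Euler-Maruyama scheme for overdamped Langevin dynamics, in which a score function plays the role of the drift. It does not implement the harmonic adapter or the reverse-diffusion step from the paper. The function names `euler_maruyama_langevin` and `toy_score`, the reduced units (unit friction, kT = 1), and the one-dimensional harmonic energy U(x) = x²/2 are all assumptions chosen for illustration; in the setting described here a learned score would take the place of `toy_score`.

```python
import numpy as np


def euler_maruyama_langevin(score, x0, dt, n_steps, rng):
    """Euler-Maruyama for dx = score(x) dt + sqrt(2) dW.

    Assumes reduced units (unit friction, kT = 1), so the score
    equals minus the gradient of the energy.
    """
    x = np.asarray(x0, dtype=float)
    traj = np.empty((n_steps + 1,) + x.shape)
    traj[0] = x
    for n in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Drift step along the score plus Gaussian noise of variance 2*dt.
        x = x + dt * score(x) + np.sqrt(2.0 * dt) * noise
        traj[n + 1] = x
    return traj


def toy_score(x):
    # Hypothetical stand-in for a learned score:
    # U(x) = x**2 / 2, so score(x) = -grad U(x) = -x.
    return -x


rng = np.random.default_rng(0)
traj = euler_maruyama_langevin(
    toy_score, x0=np.zeros(1), dt=0.01, n_steps=200_000, rng=rng
)

# Discard an initial transient, then estimate equilibrium statistics.
samples = traj[20_000:, 0]
variance = samples.var()

# Temporal correlation between states 100 steps (lag time 1.0) apart.
lag = 100
autocorr = np.corrcoef(samples[:-lag], samples[lag:])[0, 1]

print(f"variance: {variance:.3f}")
print(f"autocorrelation at lag time 1.0: {autocorr:.3f}")
```

For this toy energy the continuous-time process has stationary variance 1 and autocorrelation exp(-1) at lag time 1, so the two estimates should land near those values up to discretization and sampling error. The point of the sketch is that a sampler driven only by a score still produces correlated consecutive states, which is the sense in which the trajectories are MD-like.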
