On the Connection Between Diffusion Models and Molecular Dynamics
Liam Harcombe, Timothy T. Duignan
TL;DR
The paper addresses learning molecular force fields without explicit force labels by linking denoising diffusion models to molecular dynamics. It shows that the denoising output of a diffusion model, $F_{\text{DN}}$, approximates the true force scaled by the squared noise level, $F_{\text{DN}} \approx \sigma^2 F_{\text{true}}$, and that the score of the density $\rho$ corresponds to the force, $F_{\text{true}} = \nabla \log \rho$. It provides a simple Taylor-expansion derivation of this relationship and demonstrates a practical diffusion-based NNP, implemented inside a standard MD workflow, that simulates a coarse-grained LiCl solution and uses data duplication to improve data efficiency. Results show that the augmented diffusion model, trained on $500$ coordinate frames with data duplication, can match a force-labelled NNP trained on thousands of frames, as judged by radial distribution functions (RDFs) that align with the benchmark. The findings highlight practical data-efficiency gains and offer guidance on noise levels and training-data configurations for diffusion-based MD, enabling stable simulations without force-labelled training data.
Abstract
Neural Network Potentials (NNPs) have emerged as a powerful tool for modelling atomic interactions with high accuracy and computational efficiency. Recently, denoising diffusion models have shown promise in NNPs by training networks to remove noise added to stable configurations, eliminating the need for force data during training. In this work, we explore the connection between noise and forces by providing a new, simplified mathematical derivation of their relationship. We also show how a denoising model can be implemented using a conventional MD software package interfaced with a standard NNP architecture. We apply the approach by training a diffusion-based NNP to simulate a coarse-grained lithium chloride solution and employ data duplication to enhance model performance.
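
The connection between noise and forces described above can be checked numerically on a toy system. The following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional harmonic potential $U(x) = \tfrac{1}{2} k x^2$ in units where $k_B T = 1$, so that the equilibrium density $\rho$ is Gaussian and $F_{\text{true}} = \nabla \log \rho = -kx$, and it replaces the neural network with a least-squares linear fit. The function name `denoising_slope` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def denoising_slope(k, sigma, n_samples=200_000, seed=0):
    """Fit a linear denoiser on a 1D harmonic toy system (assumed setup).

    Clean samples are drawn from rho(x) ~ exp(-k x^2 / 2), i.e. N(0, 1/k).
    Gaussian noise of standard deviation sigma is added, and the
    displacement back to the clean sample is regressed on the noisy
    position. Only coordinates are used; no force labels enter the fit.
    """
    rng = np.random.default_rng(seed)
    x_clean = rng.normal(0.0, np.sqrt(1.0 / k), size=n_samples)
    x_noisy = x_clean + sigma * rng.normal(size=n_samples)

    # Denoising target: the vector that undoes the added noise.
    target = x_clean - x_noisy

    # Least-squares slope through the origin: target ~ slope * x_noisy.
    return np.dot(x_noisy, target) / np.dot(x_noisy, x_noisy)


k, sigma = 2.0, 0.1
slope = denoising_slope(k, sigma)

# If F_DN ~ sigma^2 * F_true and F_true = -k x, then slope ~ -sigma^2 * k,
# so dividing by -sigma^2 should recover the force constant approximately.
force_constant_estimate = -slope / sigma**2
print(f"true k = {k}, recovered k = {force_constant_estimate:.3f}")
```

For Gaussian data the fitted slope tends to $-\sigma^2 k / (1 + k\sigma^2)$ in the large-sample limit, so the recovered force constant approaches $k$ only as $\sigma \to 0$. Larger noise levels bias the estimate toward a softer potential, which is one way to see why the choice of noise level matters.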
