On the Connection Between Diffusion Models and Molecular Dynamics

Liam Harcombe, Timothy T. Duignan

TL;DR

The paper tackles learning molecular force fields without explicit force labels by linking denoising diffusion models to molecular dynamics. It identifies that the denoising output $F_{\text{DN}}$ approximates a scaled true force, $F_{\text{DN}} \approx \sigma^2 F_{\text{true}}$, where $\sigma$ is the noise level, and that the score $\nabla \log p$ corresponds to the forces, $F_{\text{true}} = \nabla \log \rho$. It provides a simple Taylor-expansion derivation of this relationship and demonstrates a practical diffusion-based NNP, implemented inside a standard MD workflow, that simulates a coarse-grained LiCl solution and uses data duplication to boost data efficiency. Results show that the augmented diffusion model, trained on 500 coordinate frames with data duplication, can match a force-labelled NNP trained on thousands of frames, as reflected in RDFs that align with the benchmark. The findings highlight practical data-efficiency gains and offer guidance on noise levels and training-data configurations for diffusion-based MD, enabling stable simulations without force-labelled training data.
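The scaling relation above can be checked numerically on a toy problem. The sketch below is not the paper's system or code: it assumes a one-dimensional harmonic well with equilibrium density $\rho(x) \propto e^{-x^2/(2s^2)}$, units in which $k_B T = 1$ (so $F_{\text{true}} = \nabla \log \rho = -x/s^2$), and a linear least-squares fit standing in for the neural network. The names `fitted_denoiser_slope`, `s`, and `sigma` are illustrative.

```python
import numpy as np


def fitted_denoiser_slope(s, sigma, n_samples=400_000, seed=0):
    """Fit a linear denoiser on noisy samples of a 1D harmonic well.

    Assumption: equilibrium samples follow rho(x) ~ exp(-x^2 / (2 s^2)),
    i.e. a Gaussian with standard deviation s (units with k_B T = 1).
    """
    rng = np.random.default_rng(seed)
    x0 = rng.normal(0.0, s, size=n_samples)         # clean equilibrium samples
    noise = rng.normal(0.0, sigma, size=n_samples)  # Gaussian noise of scale sigma
    x = x0 + noise                                  # noised coordinates
    target = x0 - x                                 # displacement back to the clean sample
    # Least-squares fit of target ~ a * x, a stand-in for a trained network.
    return float(np.dot(x, target) / np.dot(x, x))


s, sigma = 1.0, 0.1
slope_dn = fitted_denoiser_slope(s, sigma)
slope_force = -1.0 / s**2  # F_true(x) = d/dx log rho(x) = -x / s^2

print(f"fitted denoiser slope: {slope_dn:.5f}")
print(f"sigma^2 * force slope: {sigma**2 * slope_force:.5f}")
```

For this Gaussian case the exact denoiser slope is $-\sigma^2/(s^2+\sigma^2)$, which tends to $\sigma^2 \cdot (-1/s^2)$ as $\sigma \to 0$, so the two printed values should agree closely when the noise is small compared with the width of the well.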

Abstract

Neural Network Potentials (NNPs) have emerged as a powerful tool for modelling atomic interactions with high accuracy and computational efficiency. Recently, denoising diffusion models have shown promise in NNPs by training networks to remove noise added to stable configurations, eliminating the need for force data during training. In this work, we explore the connection between noise and forces by providing a new, simplified mathematical derivation of their relationship. We also demonstrate how a denoising model can be implemented using a conventional MD software package interfaced with a standard NNP architecture. We demonstrate the approach by training a diffusion-based NNP to simulate a coarse-grained lithium chloride solution and employ data duplication to enhance model performance.
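As a rough illustration of training on noised configurations with data duplication, the sketch below builds denoising training pairs from coordinate frames alone. It rests on assumptions not confirmed by this summary: that duplication means reusing each frame several times with independent noise draws, and that the regression target is the displacement back to the clean frame. The function `make_denoising_pairs`, the array shapes, and the values of `sigma` and `n_copies` are hypothetical.

```python
import numpy as np


def make_denoising_pairs(frames, sigma, n_copies, seed=0):
    """Build (noisy coordinates, denoising targets) from clean frames.

    frames: array of shape (n_frames, n_atoms, 3) holding stable configurations.
    Assumption: each frame is duplicated n_copies times, each copy with fresh noise.
    """
    rng = np.random.default_rng(seed)
    repeated = np.repeat(frames, n_copies, axis=0)        # duplicate every frame
    noise = rng.normal(0.0, sigma, size=repeated.shape)   # independent noise per copy
    noisy = repeated + noise
    targets = -noise  # displacement back to the clean frame; takes the place of a force label
    return noisy, targets


# Placeholder data: 500 frames of 4 particles (hypothetical sizes, not the paper's system).
frames = np.zeros((500, 4, 3))
noisy, targets = make_denoising_pairs(frames, sigma=0.1, n_copies=6)
print(noisy.shape, targets.shape)
```

The resulting pairs have the same layout as a coordinates-plus-forces dataset, which is one way a standard NNP architecture and MD package could be reused without modification.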

Paper Structure

This paper contains 10 sections, 12 equations, 5 figures.

Figures (5)

  • Figure 1: A flow diagram illustrating the process of constructing an NNP using a diffusion-based training scheme.
  • Figure 2: Pictorial description of the diffusion model and data duplication methodology.
  • Figure 3: RDF plots for the NNP trained on 500 frames of force data (top row, 500-forces), and the augmented diffusion model (bottom row, 500-aug-diff), compared to the traditional NNP trained on 3000 frames of force data (3000-forces).
  • Figure 4: RDF plots for the regular diffusion model trained on 1000 frames of data compared to the benchmark traditional NNP.
  • Figure 5: RDF plots for the regular diffusion model trained on 500 frames of data compared to the benchmark traditional NNP.