JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensembles

Ameya Daigavane, Bodhi P. Vani, Darcy Davidson, Saeed Saremi, Joshua Rackers, Joseph Kleinhenz

TL;DR

JAMUN addresses the slow sampling of protein conformational ensembles by performing Langevin dynamics in a smoothed coordinate space and then denoising to generate all-atom conformations, bridging traditional MD with score-based learning. The method uses an SE(3)-equivariant graph neural denoiser trained at a fixed noise level, which decouples the walk and jump steps and yields faster mixing while preserving MD-like priors. Across multiple small-peptide and macrocycle datasets, JAMUN achieves MD-like conformational coverage, significant decorrelation speedups, and competitive or superior sampling efficiency compared to baselines such as TBG and MDGen, and it transfers to unseen peptide lengths. This approach offers a practical route to rapid, physically grounded ensemble generation for drug discovery and cryptic-pocket exploration, with open-source code and weights available for reuse and extension.

Abstract

Conformational ensembles of protein structures are immensely important both for understanding protein function and for drug discovery in novel modalities such as cryptic pockets. Current techniques for sampling ensembles, such as molecular dynamics (MD), are computationally inefficient, while many recent machine learning methods do not transfer to systems outside their training data. We propose JAMUN, which performs MD in a smoothed, noised space of all-atom 3D conformations of molecules using the framework of walk-jump sampling. JAMUN enables ensemble generation for small peptides an order of magnitude faster than traditional molecular dynamics. The physical priors in JAMUN enable transferability to systems outside its training data, even to peptides longer than those it was trained on. Our model, code, and weights are available at https://github.com/prescient-design/jamun.
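
The walk-jump scheme described above can be illustrated on a toy 1D problem. The sketch below is not the JAMUN implementation: it assumes an analytic score for a two-mode Gaussian mixture in place of the learned SE(3)-equivariant denoiser, uses plain overdamped Langevin dynamics for the walk, and applies Tweedie's formula for the jump. All function names and parameter values here are hypothetical and chosen only for illustration.

```python
import numpy as np


def smoothed_score(y, means, data_std, sigma, weights):
    """Score (gradient of log-density) of a 1D Gaussian mixture
    after convolution with N(0, sigma^2). Stands in for a learned model."""
    var = data_std**2 + sigma**2              # variance of each smoothed component
    diff = means[None, :] - y[:, None]        # shape [n_samples, n_components]
    log_r = np.log(weights)[None, :] - 0.5 * diff**2 / var
    log_r -= log_r.max(axis=1, keepdims=True)  # stabilize the softmax
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)          # component responsibilities
    return (r * diff).sum(axis=1) / var


def walk_jump_sample(x0, score_fn, sigma, step, n_steps, rng):
    """Noise once, run Langevin dynamics in the smoothed space, then denoise."""
    # Initial noising: move the clean sample into the smoothed space.
    y = x0 + sigma * rng.standard_normal(x0.shape)
    # Walk: overdamped Langevin steps at a single, fixed noise level.
    for _ in range(n_steps):
        noise = rng.standard_normal(y.shape)
        y = y + step * score_fn(y) + np.sqrt(2.0 * step) * noise
    # Jump: Tweedie's formula gives the denoised estimate E[x | y].
    x_hat = y + sigma**2 * score_fn(y)
    return y, x_hat


# Toy setup (assumed values): two sharp modes at -2 and +2.
means = np.array([-2.0, 2.0])
weights = np.array([0.5, 0.5])
data_std, sigma = 0.1, 0.5
rng = np.random.default_rng(0)
x0 = np.full(1000, -2.0)  # all chains start in the left mode
score_fn = lambda y: smoothed_score(y, means, data_std, sigma, weights)
y, x_hat = walk_jump_sample(x0, score_fn, sigma, step=0.01, n_steps=200, rng=rng)
```

In this toy setting the walk variable `y` stays spread out around a mode, while the jump output `x_hat` collapses back close to the clean modes. The jump never feeds back into the walk, which is the sense in which the two steps are decoupled.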

Paper Structure

This paper contains 30 sections, 47 equations, 24 figures, 10 tables, 1 algorithm.

Figures (24)

  • Figure 1: Overview of the JAMUN sampling process, where an initial conformation is noised, propagated and denoised to obtain new conformations.
  • Figure 2: Depictions of the a) initial noising, b) walk and c) jump steps in JAMUN.
  • Figure 3: A side-by-side comparison of uncapped (left) and capped (right) ALA-CYS. The acetyl (ACE) and N-methyl (NME) capping groups provide steric hindrance and prevent local charge interactions at the N-terminal and C-terminal ends.
  • Figure 4: Noise sensitivity of JAMUN on an example test peptide, GCSL from Timewarp 4AA-Large, with all runs sampled identically, showing the tradeoff between slower mode mixing at $\sigma = 0.2\,\text{Å}$ and broken topologies at $\sigma = 0.8\,\text{Å}$.
  • Figure 5: JAMUN samples on unseen KADL when trained on Timewarp 4AA-Large. The full animation is available at https://github.com/prescient-design/jamun.
  • ...and 19 more figures