A Priori Sampling of Transition States with Guided Diffusion

Hyukjun Lim, Soojung Yang, Lucas Pinède, Miguel Steiner, Yuanqi Du, Rafael Gómez-Bombarelli

Abstract

Transition states, the first-order saddle points on the potential energy surface, govern the kinetics and mechanisms of chemical reactions and conformational changes. Locating them is challenging because transition pathways are topologically complex and can proceed via an ensemble of diverse routes. Existing methods address these challenges by introducing heuristic assumptions about the pathway or reaction coordinates, which limits their applicability when a good initial guess is unavailable or when the guess precludes alternative, potentially relevant pathways. We bypass these heuristic limitations with ASTRA, A Priori Sampling of TRAnsition States with Guided Diffusion, which reframes the transition state search as an inference-time scaling problem for generative models. ASTRA first trains a score-based diffusion model on configurations from known metastable states. It then guides inference toward the isodensity surface separating the basins of the metastable states via a principled composition of conditional scores. Finally, a Score-Aligned Ascent (SAA) process approximates a reaction coordinate from the difference between the conditioned scores and combines it with physical forces to drive convergence onto first-order transition states. Validated on benchmarks ranging from 2D potentials to biomolecular conformational changes and chemical reactions, ASTRA locates transition states with high precision and discovers multiple reaction pathways, enabling mechanistic studies of complex molecular systems.
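
The last two stages described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation. It assumes a hypothetical 2D double-well potential, replaces the learned conditional scores with analytic scores of two isotropic Gaussians (`MU_A`, `MU_B`, and `SIGMA` are illustrative choices), and uses a reflected-force update borrowed from gentlest-ascent-style saddle searches as a stand-in for the SAA step, whose exact form is not given in the abstract.

```python
import numpy as np

# Illustrative assumptions: two metastable states modeled as isotropic Gaussians.
MU_A = np.array([-1.0, 0.0])
MU_B = np.array([1.0, 0.0])
SIGMA = 0.3


def force(x):
    """Physical force F = -grad V for the toy potential V = (x^2 - 1)^2 + y^2."""
    return -np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])


def hessian(x):
    """Analytic Hessian of the toy potential, used only to check the result."""
    return np.array([[12.0 * x[0] ** 2 - 4.0, 0.0], [0.0, 2.0]])


def score(x, mu):
    """Score (gradient of log density) of an isotropic Gaussian centered at mu."""
    return -(x - mu) / SIGMA**2


def log_density_gap(x):
    """log p_A(x) - log p_B(x); zero on the isodensity surface between A and B."""
    return (np.sum((x - MU_B) ** 2) - np.sum((x - MU_A) ** 2)) / (2.0 * SIGMA**2)


def reaction_direction(x):
    """Unit vector along R = S_A - S_B, the score-difference reaction coordinate."""
    r = score(x, MU_A) - score(x, MU_B)
    return r / np.linalg.norm(r)


def score_aligned_ascent(x0, step=0.05, n_steps=500):
    """Ascend along the reaction direction while descending in all others."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        r = reaction_direction(x)
        f = force(x)
        # Reflect the force component along r so the iterate climbs the barrier.
        x = x + step * (f - 2.0 * np.dot(f, r) * r)
    return x


x_ts = score_aligned_ascent([0.4, 0.5])
print("candidate TS:", x_ts)
print("Hessian eigenvalues:", np.linalg.eigvalsh(hessian(x_ts)))
print("log-density gap:", log_density_gap(x_ts))
```

In this toy setting the iterate settles at the origin, where the Hessian has exactly one negative eigenvalue and the two state densities are equal. With learned scores, the direction `R` would generally vary with position rather than stay constant as it does for these two Gaussians.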

Paper Structure

This paper contains 46 sections, 18 equations, 26 figures, 4 tables, 1 algorithm.

Figures (26)

  • Figure 1: Overview of ASTRA. The method consists of three stages: (1) training a conditional generative model, (2) sampling from an isodensity surface, and (3) inference-time guidance that combines a reaction coordinate approximated from score differences of the two conditional models $\theta^A, \theta^B$ ($R = S_\theta^A - S_\theta^B$) with physical forces to rapidly sample transition states defined as first-order saddle points.
  • Figure 2: Our method discovers transition state regions with high precision. It finds a transition state for the (a) double well potential, and multiple transition states for the (b) Müller-Brown potential and (c) double path potential.
  • Figure 3: Application of our method to chemical systems. (a) For the alanine dipeptide, the generated samples (circles) localize the transition region between two states. (b) For chignolin, the free energy landscape is projected onto the two slowest time-independent components learned from a converged simulation (Ref. lindorff2011fast). The background densities in both plots correspond to the training data distribution, where distinct colors indicate the defined State A and State B. The color scale for the alanine dipeptide plot represents the MD-based committor value, while the scale for chignolin indicates the machine-learned committor value (Ref. kang2024computing).
  • Figure 4: Detailed performance of ASTRA on alanine dipeptide. (a) ASTRA samples covering all three TSs are compared to the Nudged Elastic Band reference method. For readability, one randomly selected ASTRA sample is shown for each TS. The background shows the training data distribution, and colors represent the two states defined for classifier-free guidance training. The three-dimensional structure is overlaid for each identified TS. (b) Distribution of committor values for the stable ASTRA samples, peaked around 0.5. (c-e) Proximity of ASTRA samples to their Dimer-optimized counterparts: (c) distribution of the number of iterations needed to converge Dimer from the stable ASTRA samples; (d) Root Mean Square Deviation (RMSD) between each ASTRA sample before and after Dimer optimization; (e) the corresponding difference in energy. We indicate the number of stable ASTRA samples relative to the total number of structures drawn from our algorithm for that analysis, alongside the Dimer convergence rate from stable ASTRA samples. Stable ASTRA samples are structures sampled by ASTRA from which a 2 ps MD run remains stable.
  • Figure 5: Visualization of chignolin transition state mechanisms. Representative structures from ASTRA-sampled TSdown and TSup (left) ensembles are overlaid with transparent tubes on the reference conformations from Ref. kang2024computing (right). The structural agreement validates our method's ability to resolve distinct folding pathways.
  • ...and 21 more figures
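
The committor check mentioned in the Figure 4 caption (values peaked around 0.5 for transition-state-like structures) can be illustrated with a toy estimate. This is a minimal sketch under stated assumptions, not the paper's MD protocol: it uses a hypothetical 1D double well, overdamped Langevin dynamics, and illustrative basin boundaries at x = ±0.8. The function `estimate_committor` and all of its parameters are names introduced here for illustration.

```python
import numpy as np


def estimate_committor(x0, n_trials=400, dt=1e-3, kT=0.3,
                       bound=0.8, max_steps=100_000, seed=0):
    """Fraction of trajectories started at x0 that reach state B before state A.

    Toy setup (assumption): V(x) = (x^2 - 1)^2, state A is x <= -bound,
    state B is x >= +bound, and dynamics are overdamped Langevin.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, float(x0))
    hit_a = np.zeros(n_trials, dtype=bool)
    hit_b = np.zeros(n_trials, dtype=bool)
    noise_scale = np.sqrt(2.0 * kT * dt)

    for _ in range(max_steps):
        active = ~(hit_a | hit_b)
        if not active.any():
            break
        xa = x[active]
        drift = -4.0 * xa * (xa**2 - 1.0)  # force = -dV/dx
        # Euler-Maruyama step, applied only to trajectories not yet committed.
        x[active] = xa + drift * dt + noise_scale * rng.standard_normal(xa.size)
        hit_a |= x <= -bound
        hit_b |= x >= bound

    committed = hit_a | hit_b
    # Trajectories that never reach either state are left out of the ratio.
    return hit_b.sum() / max(committed.sum(), 1)


for start in (-0.5, 0.0, 0.5):
    print(f"x0 = {start:+.1f}  committor estimate = {estimate_committor(start):.2f}")
```

A structure at the barrier top (x0 = 0 here) should commit to either state with roughly equal probability, while starting points displaced toward one basin should give estimates closer to 0 or 1.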