Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Aaron Havens, Benjamin Kurt Miller, Bing Yan, Carles Domingo-Enrich, Anuroop Sriram, Brandon Wood, Daniel Levine, Bin Hu, Brandon Amos, Brian Karrer, Xiang Fu, Guan-Horng Liu, Ricky T. Q. Chen

TL;DR

This paper tackles the challenge of sampling from unnormalized energy-based distributions in high dimensions. It introduces Adjoint Sampling, an on-policy diffusion method rooted in stochastic optimal control that builds on Adjoint Matching (AM) with a Reciprocal Projection to enable many gradient updates per energy evaluation. The approach combines a Schrödinger-bridge formulation, the Reciprocal Adjoint Matching (RAM) and AM objectives, and a decoupled replay-buffer training scheme, with geometric extensions for SE(3) invariance and periodic boundary conditions. Empirical results show strong performance on synthetic energy landscapes and scalable, amortized conformer generation for molecular systems, with direct relevance to computational chemistry and energy-based modeling. The authors also plan to open-source their benchmarks to spur further advances in scalable sampling methods.

Abstract

We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities, or energy functions. It is the first on-policy approach that allows significantly more gradient updates than the number of energy evaluations and model samples, allowing us to scale to much larger problem settings than previously explored by similar methods. Our framework is theoretically grounded in stochastic optimal control and shares the same theoretical guarantees as Adjoint Matching, being able to train without the need for corrective measures that push samples towards the target distribution. We show how to incorporate key symmetries, as well as periodic boundary conditions, for modeling molecules in both cartesian and torsional coordinates. We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models where we perform amortized conformer generation across many molecular systems. To encourage further research in developing highly scalable sampling methods, we plan to open source these challenging benchmarks, where successful methods can directly impact progress in computational chemistry.

Paper Structure

This paper contains 73 sections, 5 theorems, 84 equations, 12 figures, 5 tables, 1 algorithm.

Key Result

Proposition 3.1

After projection eq:mark_projection, Reciprocal Adjoint Matching is equivalent to Adjoint Matching, and furthermore, this projection improves upon the SOC objective,

Figures (12)

  • Figure 1: Starting from the uncontrolled diffusion process $p^{\text{base}}_1$ (left-most panel), Adjoint Sampling uses the Reciprocal Projection $(X_t, X_1) \sim p^{\text{base}}_{t|1}p^u_1$ of the current controlled SDE to approximate the joint trajectory distribution $p^u_t$, allowing us to take several gradient steps on the RAM objective per evaluated sample and energy gradient $(X_1, \nabla g(X_1))$. After several iterations, Adjoint Sampling converges to the target Boltzmann distribution $\mu$ (right-most panel).
  • Figure 2: The iterations of Adjoint Sampling for amortized molecular conformer generation. SMILES strings condition the stochastic differential equation (SDE) on a specific molecular graph $\mathcal{G}^{(i)}$, and the energy gradients, final states, and conditioning $(\nabla g^{(i)},X^{(i)}_1, \mathcal{G}^{(i)})$ are stored in a replay buffer. The model is trained by minimizing the RAM loss on samples from this buffer, progressively transforming the samples into realistic molecular conformations.
  • Figure 3: The figure depicts two sampled trajectories from trained Adjoint Sampling models that use either the cartesian or the torsional representations. They target conformations of the held-out SMILES string COCSc1sc2ccccc2[n+]1[O-]. The left frame $X_0$ comes from the initial Dirac distribution and the right frame $X_1$ is a sampled conformer.
  • Figure 4: Recall coverage versus RMSD threshold for Adjoint Sampling variants and RDKit. We show performance both with and without relaxation.
  • Figure 5: Multiple representations of the molecule diiodoethane with SMILES string ICCI. (Left) Molecular graph representation. (Right) 3D coordinate representation. We chose the torsion angle as I-C-C-I, where I indicates a purple iodine atom and C indicates a gray carbon atom. We identify the torsions with a 4-tuple of indices, selecting the heaviest atoms on either side of the central bond as the first and last members. Ties are broken by arbitrary atomic index. We take the zero dihedral angle to be when the first and last members of the 4-tuple are close, the so-called cis-isomer. Following jing2022torsional, we adjust the dihedral angle by rotating all the atoms at one end of the torsion about the bond axis, leaving the remaining atoms in place. This can be described as a torque pointing along the bond axis. When the torque would act asymmetrically on the molecule, we define the positive direction towards the side of the molecule with more atoms.
  • ...and 7 more figures
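
The Reciprocal Projection in the Figure 1 caption, $(X_t, X_1) \sim p^{\text{base}}_{t|1}p^u_1$, can be illustrated with a minimal NumPy sketch. The base process is not specified in this summary, so the sketch assumes the simplest case: a driftless base SDE $dX_t = \sigma\,dW_t$ with constant $\sigma$ and $X_0 = 0$ (a Dirac initial distribution, as mentioned in Figure 3). Under that assumption $p^{\text{base}}_{t|1}$ is a Brownian bridge, $X_t \mid X_1 \sim \mathcal{N}(t X_1, \sigma^2 t(1-t) I)$. The function name, the `sigma` parameter, and the stand-in buffer are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def sample_reciprocal_projection(x1, t, sigma=1.0, rng=None):
    """Draw X_t ~ p^base_{t|1}(. | X_1 = x1) under an ASSUMED base process.

    Assumption (not from the paper): dX = sigma dW with X_0 = 0, so the
    conditional is a Brownian bridge with mean t * x1 and variance
    sigma^2 * t * (1 - t) per coordinate.
    """
    rng = np.random.default_rng() if rng is None else rng
    x1 = np.asarray(x1, dtype=float)
    # Reshape t so one time value per sample broadcasts over feature dims.
    t = np.asarray(t, dtype=float).reshape(-1, *([1] * (x1.ndim - 1)))
    mean = t * x1
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x1.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for terminal samples X_1 stored in a replay buffer.
    buffer_x1 = rng.standard_normal((4, 3))
    # Many (t, X_t) training pairs are drawn per stored X_1 without
    # re-simulating the SDE or re-evaluating the energy.
    for step in range(3):
        idx = rng.integers(0, len(buffer_x1), size=8)
        t = rng.uniform(size=8)
        xt = sample_reciprocal_projection(buffer_x1[idx], t, rng=rng)
        print(step, xt.shape)
```

Because $X_t$ is drawn in closed form given a stored $X_1$, each buffered sample and its energy gradient can be reused for several gradient steps, which is the decoupling the captions of Figures 1 and 2 describe. A different base process would need its own conditional $p^{\text{base}}_{t|1}$.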

Theorems & Definitions (9)

  • Proposition 3.1
  • Theorem 3.2: Theoretical guarantees of Adjoint Sampling (informal)
  • Proposition A.1
  • proof
  • Definition C.1: Reciprocal class, Def. 3 of shi2024diffusion
  • Lemma C.2: Reciprocal projection characterization, Prop. 4 of shi2024diffusion
  • proof: Proof of \ref{prop:projection}
  • Theorem C.3: Adjoint Sampling
  • proof