Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities

Tara Akhound-Sadegh, Jungyoon Lee, Avishek Joey Bose, Valentin De Bortoli, Arnaud Doucet, Michael M. Bronstein, Dominique Beaini, Siamak Ravanbakhsh, Kirill Neklyudov, Alexander Tong

TL;DR

This paper introduces Progressive Inference-Time Annealing (PITA), a diffusion-based framework for sampling from Boltzmann densities that combines temperature annealing with diffusion path interpolation. PITA trains a ladder of diffusion models from high to low temperature; at inference time, each trained model is annealed to a lower temperature using a novel Feynman-Kac PDE coupled with Sequential Monte Carlo, which yields unbiased samples at the target temperature with far fewer energy evaluations. Training combines denoising score matching, target score matching, and energy-based model distillation with energy pinning. This enables Cartesian-coordinate sampling for LJ-13, Alanine Dipeptide, and Alanine Tripeptide, often surpassing state-of-the-art baselines. Empirical results show strong mode coverage, competitive or superior distribution metrics, and substantial amortization benefits.
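To make the denoising score matching ingredient concrete, here is a minimal toy sketch, not the paper's training code. It assumes 1D standard-normal "data", additive Gaussian noise of scale `sigma` (a variance-exploding style corruption chosen for illustration), and a hypothetical one-parameter score model `s(x) = a * x`. Regressing onto the denoising target should recover the slope of the exact marginal score, `-1 / (1 + sigma^2)`.

```python
import numpy as np

def dsm_linear_score(sigma, n=200_000, seed=0):
    """Fit s(x) = a * x by least squares on the denoising score matching target."""
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n)        # toy data: x0 ~ N(0, 1) (assumption)
    eps = rng.standard_normal(n)
    xt = x0 + sigma * eps              # noised sample, so xt ~ N(0, 1 + sigma^2)
    # Denoising target: score of the Gaussian kernel N(xt; x0, sigma^2) w.r.t. xt.
    target = -(xt - x0) / sigma**2
    # Closed-form least-squares slope for the linear model a * xt.
    return np.dot(xt, target) / np.dot(xt, xt)

sigma = 0.5
a_hat = dsm_linear_score(sigma)
a_true = -1.0 / (1.0 + sigma**2)       # exact marginal score is -x / (1 + sigma^2)
print(f"fitted slope {a_hat:.4f}, exact slope {a_true:.4f}")
```

The per-sample target is noisy, but its conditional mean given `xt` is the marginal score, which is why a plain regression suffices in this toy case.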

Abstract

Sampling efficiently from a target unnormalized probability density remains a core challenge, with relevance across countless high-impact scientific applications. A promising approach towards this challenge is the design of amortized samplers that borrow key ideas, such as probability path design, from state-of-the-art generative diffusion models. However, all existing diffusion-based samplers remain unable to draw samples from distributions at the scale of even simple molecular systems. In this paper, we propose Progressive Inference-Time Annealing (PITA), a novel framework to learn diffusion-based samplers that combines two complementary interpolation techniques: I.) Annealing of the Boltzmann distribution and II.) Diffusion smoothing. PITA trains a sequence of diffusion models from high to low temperatures by sequentially training each model at a progressively lower temperature, starting from a high temperature at which samples of the temperature-annealed target density are easy to obtain. In the subsequent step, PITA simulates the trained diffusion model to procure training samples at a lower temperature for the next diffusion model, through inference-time annealing using a novel Feynman-Kac PDE combined with Sequential Monte Carlo. Empirically, PITA enables, for the first time, equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates with dramatically fewer energy function evaluations. Code available at: https://github.com/taraak/pita
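The step from a higher to a lower temperature can be illustrated with plain importance reweighting followed by multinomial resampling, the basic building block of Sequential Monte Carlo. This is a simplified stand-in and not the Feynman-Kac weighting used by PITA. The harmonic energy `U(x) = x^2 / 2`, the inverse temperatures, and the function name `anneal_by_reweighting` are all assumptions chosen so the result can be checked analytically: the Boltzmann density at inverse temperature `beta` is then `N(0, 1 / beta)`.

```python
import numpy as np

def anneal_by_reweighting(x, energy, beta_from, beta_to, rng):
    """Reweight samples of exp(-beta_from * U) towards exp(-beta_to * U), then resample."""
    log_w = -(beta_to - beta_from) * energy(x)   # log importance weights
    log_w -= log_w.max()                         # stabilise the exponentials
    w = np.exp(log_w)
    w /= w.sum()                                 # self-normalised weights
    ess = 1.0 / np.sum(w**2)                     # effective sample size
    idx = rng.choice(len(x), size=len(x), p=w)   # multinomial resampling
    return x[idx], ess

rng = np.random.default_rng(0)
energy = lambda x: 0.5 * x**2                    # toy harmonic energy (assumption)
beta_hot, beta_cold = 0.5, 1.0                   # high temperature -> low temperature
x_hot = rng.normal(0.0, np.sqrt(1.0 / beta_hot), size=100_000)
x_cold, ess = anneal_by_reweighting(x_hot, energy, beta_hot, beta_cold, rng)
ess_fraction = ess / len(x_hot)
print(f"variance after annealing: {x_cold.var():.3f} (exact 1/beta = {1.0 / beta_cold:.3f})")
print(f"ESS fraction: {ess_fraction:.3f}")
```

In this easy 1D case the effective sample size stays high. For larger temperature gaps or higher dimensions the weights degenerate quickly, which is one motivation for taking small annealing steps and for combining reweighting with a learned model.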

Paper Structure

This paper contains 31 sections, 5 theorems, 41 equations, 13 figures, 12 tables, 1 algorithm.

Key Result

Proposition 1

[Inference-time Annealing] The annealed density of the energy-based model, $q_t(x) \propto \exp\left(-\gamma U_t(x;\eta)\right)$, matches the marginal densities of the following SDE
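Independently of the SDE in the proposition, the definition of $q_t$ alone already fixes its score. The short derivation below is a worked consequence of that definition only; it reads $\gamma$ as the annealing exponent applied to the $\gamma = 1$ energy-based model, which is an interpretation of the notation above rather than a statement taken from the paper.

```latex
\begin{aligned}
q_t(x) &= \frac{1}{Z_t(\gamma)} \exp\left(-\gamma U_t(x;\eta)\right),
\qquad Z_t(\gamma) = \int \exp\left(-\gamma U_t(y;\eta)\right)\,dy, \\
\nabla_x \log q_t(x) &= -\gamma \nabla_x U_t(x;\eta) - \nabla_x \log Z_t(\gamma)
= -\gamma \nabla_x U_t(x;\eta).
\end{aligned}
```

The normalizing constant $Z_t(\gamma)$ does not depend on $x$, so annealing by $\gamma$ simply rescales the score of the $\gamma = 1$ model by $\gamma$.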

Figures (13)

  • Figure 1: Illustration of the proposed PITA framework combining two complementary processes: temperature annealing of the target Boltzmann density and the diffusion process applied to the collected samples. Annealed inference allows for decreasing the temperature (increasing $\beta$) of a trained diffusion model, thus generating samples from the annealed target. These samples can be reused for training a lower-temperature diffusion model.
  • Figure 2: LJ-13 sampling task. We compare the distribution of the interatomic distances and energy of the particles in the MCMC dataset (ground-truth), samples generated using a PITA model, and TA-BG progressively trained from high temperature to sample from the target distribution.
  • Figure 3: Molecular conformation sampling tasks. We compare the energy distribution of the ground-truth MD dataset and the samples generated using different models at $300$K. We use 30k samples for the plots.
  • Figure 4: TICA plots for Alanine Dipeptide (ALDP) at 300K obtained from different methods using 30k samples. Each panel shows the free energy landscape along the top two TICA components which capture the dominant slow transitions in the system.
  • Figure 5: TICA plots for Alanine Tripeptide (AL3) at 300K obtained from different methods using 30k samples.
  • ...and 8 more figures

Theorems & Definitions (8)

  • Proposition 1
  • Proposition 2
  • Proposition 2
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof