BNEM: A Boltzmann Sampler Based on Bootstrapped Noised Energy Matching

RuiKang OuYang, Bo Qiang, José Miguel Hernández-Lobato

TL;DR

This work tackles efficient IID sampling from Boltzmann distributions by learning noised energies through diffusion (Noised Energy Matching, NEM) rather than denoising scores. Building on NEM, Bootstrap NEM (BNEM) uses bootstrapped, higher-noise targets from lower-noise energies to trade bias for variance, with a bi-level training loop and variance control. Theoretical analysis shows energy-based targets offer lower MC variance than score-based targets, and BNEM further reduces training variance while maintaining accuracy. Empirically, NEM and BNEM achieve state-of-the-art performance on multi-particle and high-dimensional benchmarks with greater robustness and efficiency than prior diffusion- or energy-based samplers, indicating strong potential for Boltzmann sampling in molecular and materials systems.

Abstract

Developing an efficient sampler that generates independent and identically distributed (IID) samples from a Boltzmann distribution is a crucial challenge in scientific research, for example in molecular dynamics. In this work, we aim to learn neural samplers given only energy functions, rather than data sampled from the Boltzmann distribution. By learning the energies of the noised data, we propose a diffusion-based sampler, Noised Energy Matching (NEM), which theoretically has lower variance than related work, at the cost of higher complexity. Furthermore, a novel bootstrapping technique is applied to NEM, yielding Bootstrap NEM (BNEM), to balance bias against variance. We evaluate NEM and BNEM on a 2-dimensional, 40-component Gaussian mixture model (GMM) and a 4-particle double-well potential (DW-4). The experimental results demonstrate that BNEM can achieve state-of-the-art performance while being more robust.

Paper Structure (50 sections, 9 theorems, 92 equations, 13 figures, 10 tables, 2 algorithms)

Key Result

Proposition 3.1

If $\exp(-\mathcal{E}(x_{0|t}^{(i)}))$ is sub-Gaussian, then with probability $1 - \delta$ over $x_{0|t}^{(i)} \sim \mathcal{N}(x_t, \sigma_t^2 I)$, we have where $v_{0t}(x_t)=\mathrm{Var}_{\mathcal{N}(x;x_t, \sigma_t^2I)}[\exp(-\mathcal{E}_t(x))]$.
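The samples $x_{0|t}^{(i)} \sim \mathcal{N}(x_t, \sigma_t^2 I)$ in the proposition are the ones used to build a Monte Carlo estimate of the noised energy. The sketch below is one plausible reading of that estimator, not the authors' implementation: it assumes the estimator has the form $-\log \frac{1}{K}\sum_{i=1}^{K} \exp(-\mathcal{E}(x_{0|t}^{(i)}))$, which this summary does not state explicitly. The function names `mc_noised_energy`, `quadratic_energy`, and `exact_noised_quadratic_energy` are hypothetical, and the quadratic toy energy is chosen only because its noised energy is known in closed form.

```python
import numpy as np


def mc_noised_energy(energy_fn, x_t, sigma_t, num_samples, rng):
    """Monte Carlo estimate of the noised energy at x_t (assumed form).

    Draws x_{0|t}^{(i)} ~ N(x_t, sigma_t^2 I) and returns
    -log( (1/K) * sum_i exp(-energy_fn(x_{0|t}^{(i)})) ).
    """
    x_t = np.asarray(x_t, dtype=float)
    noise = rng.standard_normal((num_samples, x_t.shape[-1]))
    samples = x_t + sigma_t * noise            # shape (K, d)
    neg_energy = -energy_fn(samples)           # shape (K,)
    # Log-mean-exp with the maximum subtracted for numerical stability.
    shift = np.max(neg_energy)
    log_mean = shift + np.log(np.mean(np.exp(neg_energy - shift)))
    return -log_mean


def quadratic_energy(x):
    """Toy target energy E(x) = ||x||^2 / 2 (illustrative assumption)."""
    return 0.5 * np.sum(np.asarray(x, dtype=float) ** 2, axis=-1)


def exact_noised_quadratic_energy(x_t, sigma_t):
    """Closed-form noised energy of the toy target, used as a reference."""
    x_t = np.asarray(x_t, dtype=float)
    scale = 1.0 + sigma_t ** 2
    return 0.5 * np.sum(x_t ** 2, axis=-1) / scale + 0.5 * x_t.shape[-1] * np.log(scale)


rng = np.random.default_rng(0)
x_t = np.array([0.5, -1.0])
for k in (10, 1_000, 100_000):
    estimate = mc_noised_energy(quadratic_energy, x_t, 0.8, k, rng)
    error = abs(estimate - exact_noised_quadratic_energy(x_t, 0.8))
    print(f"K={k:>6d}  estimate={estimate:.4f}  abs. error={error:.4f}")
```

On this toy problem the estimate approaches the closed-form value as $K$ grows, which is the qualitative behaviour the proposition quantifies.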

Figures (13)

  • Figure 1: Both NEM and BNEM parameterize a time-dependent energy network $E_\theta(x_t, t)$ to target the energies of noised data. NEM targets an MC energy estimator computed from the target energy function; BNEM targets a Bootstrap energy estimator computed from learned energy functions at a slightly lower noise level. Contours are the ground truth energies at different noise levels; $\bullet$ represents samples used for computing the MC energy estimator, $\bullet$ represents samples used for computing the Bootstrap energy estimator, and the white contour line represents the learned energy at time $u$.
  • Figure 2: Sampled points from samplers applied to GMM-40 potentials, with the ground truth represented by contour lines. For diffusion-based methods, the reverse SDE integration steps are limited to 100.
  • Figure 3: Histogram for energy (top) and interatomic distance (bottom) of generated samples.
  • Figure 4: Barplots comparing iDEM, NEM, and BNEM evaluations with $1000$ vs. $100$ integration steps and MC samples on the LJ-13 benchmark.
  • Figure 5: Number of energy evaluations vs. $\mathcal{E}\text{-}\mathcal{W}_2$. Left: GMM; Right: LJ-55.
  • ...and 8 more figures
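The bootstrap idea in the Figure 1 caption (estimating the energy at one noise level from an energy function at a slightly lower noise level) can be sketched as follows. This is a hedged illustration, not the paper's code. It assumes variance-exploding noising, so that moving from noise level $\sigma_s$ to $\sigma_t > \sigma_s$ adds Gaussian noise of variance $\sigma_t^2 - \sigma_s^2$. In place of a learned network it uses the closed-form noised energy of a quadratic toy target. The names `bootstrap_noised_energy` and `noised_quadratic_energy` are hypothetical.

```python
import numpy as np


def bootstrap_noised_energy(energy_at_s, x_t, sigma_s, sigma_t, num_samples, rng):
    """Estimate the energy at noise level sigma_t from the energy at sigma_s.

    Assumes variance-exploding noising: samples are drawn from
    N(x_t, (sigma_t^2 - sigma_s^2) I) and passed through energy_at_s.
    """
    if sigma_t <= sigma_s:
        raise ValueError("sigma_t must be larger than sigma_s")
    x_t = np.asarray(x_t, dtype=float)
    gap_std = np.sqrt(sigma_t ** 2 - sigma_s ** 2)
    x_s = x_t + gap_std * rng.standard_normal((num_samples, x_t.shape[-1]))
    neg_energy = -energy_at_s(x_s)             # shape (K,)
    # Log-mean-exp with the maximum subtracted for numerical stability.
    shift = np.max(neg_energy)
    return -(shift + np.log(np.mean(np.exp(neg_energy - shift))))


def noised_quadratic_energy(x, sigma):
    """Closed-form noised energy for the toy target E(x) = ||x||^2 / 2.

    Stands in for a learned energy network at noise level sigma.
    """
    x = np.asarray(x, dtype=float)
    scale = 1.0 + sigma ** 2
    return 0.5 * np.sum(x ** 2, axis=-1) / scale + 0.5 * x.shape[-1] * np.log(scale)


rng = np.random.default_rng(0)
x_t = np.array([0.5, -1.0])
sigma_s, sigma_t = 0.6, 0.8
estimate = bootstrap_noised_energy(
    lambda x: noised_quadratic_energy(x, sigma_s), x_t, sigma_s, sigma_t, 50_000, rng
)
print("bootstrap estimate:", estimate)
print("closed form at sigma_t:", noised_quadratic_energy(x_t, sigma_t))
```

Because the lower-noise energy here is exact, the bootstrap estimate is unbiased up to Monte Carlo error. With a learned energy at level $s$, any error in that network would carry into the target, which is the bias side of the bias-variance trade-off the TL;DR describes.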

Theorems & Definitions (9)

  • Proposition 3.1
  • Corollary 3.2
  • Proposition 3.3
  • Proposition 3.4
  • Proposition B.1
  • Corollary B.2
  • Proposition C.1
  • Proposition F.1
  • Proposition G.1