Coarse-Grained Boltzmann Generators

Weilong Chen; Bojun Zhao; Jan Eckwert; Julija Zavadlav

TL;DR

Coarse-Grained Boltzmann Generators (CG-BGs) address the longstanding challenge of unbiased equilibrium sampling in large molecular systems by operating in coarse-grained coordinates and reweighting generated samples with a learned potential of mean force (PMF). The method combines a flow-based proposal with an energy target U_η(R) learned via Enhanced Sampling Force Matching (ESFM), enabling reweighting to the true Boltzmann distribution p(R) ∝ e^{−βU(R)} even when training data come from biased or rapidly converged simulations. By reducing dimensionality, CG-BGs scale to larger systems while preserving thermodynamic consistency, and they capture solvent-mediated and many-body effects in reduced representations. CG-BGs also support simulation-free evaluation of learned PMFs, so candidate PMFs can be benchmarked without running new MD simulations. The approach shows favorable accuracy–efficiency trade-offs across coarse-graining resolutions, including systems with explicit solvent.
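The reweighting step described above can be illustrated with a minimal sketch of self-normalized importance sampling. Everything in it is a stand-in chosen for illustration, not the paper's setup: a one-dimensional double well plays the role of the learned PMF, a zero-mean Gaussian plays the role of the flow proposal, and the names `pmf`, `proposal_logpdf`, `reweight`, and `BETA` are hypothetical.

```python
import math
import random

BETA = 1.0  # inverse temperature in arbitrary units (assumption)


def pmf(x):
    """Hypothetical stand-in for a learned PMF: a 1D double well."""
    return (x * x - 1.0) ** 2


def proposal_logpdf(x, sigma=1.5):
    """Log-density of the stand-in proposal: a zero-mean Gaussian."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def reweight(samples, sigma=1.5):
    """Self-normalized weights proportional to exp(-beta * U(x)) / q(x)."""
    log_w = [-BETA * pmf(x) - proposal_logpdf(x, sigma) for x in samples]
    shift = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def effective_sample_size(weights):
    """Kish effective sample size of normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def reference_expectation(f, lo=-4.0, hi=4.0, n=20001):
    """Grid quadrature of <f> under exp(-beta * U), used only as a check."""
    h = (hi - lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = lo + i * h
        b = math.exp(-BETA * pmf(x))
        num += f(x) * b
        den += b
    return num / den


rng = random.Random(0)
samples = [rng.gauss(0.0, 1.5) for _ in range(50000)]  # draws from the proposal
weights = reweight(samples)
estimate = sum(w * x * x for w, x in zip(weights, samples))  # reweighted <x^2>
reference = reference_expectation(lambda x: x * x)
ess = effective_sample_size(weights)
```

The effective sample size indicates how well the proposal overlaps the target: the closer it is to the number of samples, the less variance the reweighted estimate carries.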

Abstract

Sampling equilibrium molecular configurations from the Boltzmann distribution is a longstanding challenge. Boltzmann Generators (BGs) address this by combining exact-likelihood generative models with importance sampling, but their practical scalability is limited. Meanwhile, coarse-grained surrogates enable the modeling of larger systems by reducing effective dimensionality, yet often lack the reweighting process required to ensure asymptotically correct statistics. In this work, we propose Coarse-Grained Boltzmann Generators (CG-BGs), a principled framework that unifies scalable reduced-order modeling with the exactness of importance sampling. CG-BGs act in a coarse-grained coordinate space, using a learned potential of mean force (PMF) to reweight samples generated by a flow-based model. Crucially, we show that this PMF can be efficiently learned from rapidly converged data via force matching. Our results demonstrate that CG-BGs faithfully capture complex interactions mediated by explicit solvent within highly reduced representations, establishing a scalable pathway for the unbiased sampling of larger molecular systems.
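To make the force-matching idea concrete, here is a minimal sketch under toy assumptions. It is ordinary least-squares force matching on a one-dimensional model, not the ESFM procedure itself. The toy PMF x^4 - 2x^2, the Gaussian noise standing in for fluctuations of instantaneous forces, and the names `true_mean_force` and `fit_force_matching` are all hypothetical. Configurations are drawn uniformly rather than from the Boltzmann distribution, to illustrate that the least-squares minimizer targets the mean force wherever the data have support.

```python
import random


def true_mean_force(x):
    """Mean force -dU/dx for the toy PMF U(x) = x**4 - 2*x**2 (assumption)."""
    return -(4.0 * x ** 3 - 4.0 * x)


def fit_force_matching(xs, forces):
    """Fit U_eta(x) = eta1*x**2 + eta2*x**4 by least squares on forces.

    The model force F_eta(x) = -(2*eta1*x + 4*eta2*x**3) is linear in eta,
    so the squared-error force-matching loss is minimized by solving the
    2x2 normal equations directly.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x, f in zip(xs, forces):
        g1, g2 = -2.0 * x, -4.0 * x ** 3  # dF/d(eta1), dF/d(eta2)
        a11 += g1 * g1
        a12 += g1 * g2
        a22 += g2 * g2
        b1 += g1 * f
        b2 += g2 * f
    det = a11 * a22 - a12 * a12
    eta1 = (a22 * b1 - a12 * b2) / det
    eta2 = (a11 * b2 - a12 * b1) / det
    return eta1, eta2


rng = random.Random(1)
# Non-Boltzmann (uniform) configurations with noisy instantaneous forces.
xs = [rng.uniform(-1.8, 1.8) for _ in range(5000)]
forces = [true_mean_force(x) + rng.gauss(0.0, 1.0) for x in xs]
eta1, eta2 = fit_force_matching(xs, forces)  # toy ground truth: (-2.0, 1.0)
```

In this toy setting the fitted coefficients approach the ground truth as the number of samples grows, even though each individual force label is noisy.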

Paper Structure (39 sections, 5 theorems, 36 equations, 14 figures, 6 tables, 3 algorithms)

Introduction
Background and Preliminaries
Boltzmann Generators and Emulators
Continuous Normalizing Flows
Coarse-Graining and Potentials of Mean Force
Coarse-Grained Boltzmann Generators
Variational Force Matching
Enhanced Sampling for Force Matching
The CG-BG Workflow
Experiments
Recovering Equilibrium Distributions from Biased and Unbiased Data
Effect of Coarse-Graining Resolution on Accuracy and Efficiency
Simulation-Free Evaluation of Learned PMFs
Related Work
Conclusion
...and 24 more sections

Key Result

Proposition 1

Let $p^*(\mathbf{R}) \propto e^{-\beta U^*(\mathbf{R})}$ be the true marginal and $p_\eta(\mathbf{R}) \propto e^{-\beta U_\eta(\mathbf{R})}$ the learned distribution. If $p^*$ satisfies a Logarithmic Sobolev Inequality (LSI) with constant $\rho > 0$, then the Kullback-Leibler divergence between the
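
The statement above is cut off in this excerpt, and its conclusion is not reconstructed here. For orientation only, the textbook form of the LSI is given below, assuming the normalization in which the constant enters as $1/(2\rho)$ (other conventions differ by constant factors). It bounds the KL divergence of any smooth density $q$ from $p^*$ by a relative Fisher information:

$$
\mathrm{KL}(q \,\|\, p^*) \;\le\; \frac{1}{2\rho}\, \mathbb{E}_{q}\!\left[ \left\| \nabla \log \frac{q(\mathbf{R})}{p^*(\mathbf{R})} \right\|^2 \right].
$$

Choosing $q = p_\eta$ and using $\nabla \log \frac{p_\eta}{p^*} = -\beta \left( \nabla U_\eta - \nabla U^* \right)$ gives, under the same convention,

$$
\mathrm{KL}(p_\eta \,\|\, p^*) \;\le\; \frac{\beta^2}{2\rho}\, \mathbb{E}_{p_\eta}\!\left[ \left\| \nabla U_\eta(\mathbf{R}) - \nabla U^*(\mathbf{R}) \right\|^2 \right],
$$

so a small mean-squared error in the forces implies a small KL divergence. Whether the proposition uses this direction of the divergence and these constants is an assumption.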

Figures (14)

Figure 1: CG-BG workflow. (1) Training data are collected and mapped from atomistic configurations to CG beads. (2) A PMF network learns $U_\eta(\mathbf{R})$ from rapidly converged data, while a normalizing flow learns a proposal density $q_\theta(\mathbf{R})$. (3) CG samples from flow models are reweighted with the PMF to recover the target distribution $p(\mathbf{R})$ and compute unbiased thermodynamic observables.
Figure 2: CG-BGs on the MB potential. (a) Two-dimensional MB potential energy surface (functional form given in the datasets section). (b) Marginal probability density along the $x$ coordinate. (c) Free energy profiles before and after reweighting for CG-BGs, where the flow is trained on unbiased data, compared with the exact solution and MD reference. (d) Same as (c), but for a flow trained on biased data.
Figure 3: CG-BGs on alanine dipeptide (Heavy Atom). (a) Heavy Atom mapping. (b) Potential energy distributions under the learned PMF before and after reweighting, compared with the MD reference. (c) $\phi$ dihedral free energy profile before and after reweighting for CG-BGs, where the flow is trained on 500 ns of unbiased data, alongside the MD reference. (d) Same as (c), but for a flow trained on a 10 ns WT-MetaD ($\gamma=1.5$) dataset.
Figure 4: CG-BGs on alanine dipeptide (Core Beta). (a) Core Beta mapping. (b) Potential energy distributions under the learned PMF before and after reweighting, compared with the MD reference. (c) $\phi$ dihedral free energy profile before and after reweighting for CG-BGs, where the flow is trained on 500 ns of unbiased data, alongside the MD reference. (d) Same as (c), but for a flow trained on a 10 ns WT-MetaD ($\gamma=1.5$) dataset.
Figure 5: Simulation-free benchmarking of learned CG PMFs using CG-BGs (Heavy Atom). (a) Probability density of the $\phi$ dihedral angle after reweighting with PMFs trained on unbiased ($\text{PMF}_U$) and rapidly converged biased datasets ($\text{PMF}_B$), compared with the MD reference and flow proposal (trained on unbiased data). (b) Corresponding $\phi$ dihedral free energy profiles after reweighting, alongside the MD reference and flow proposal.
...and 9 more figures

Theorems & Definitions (7)

Proposition 1
Proposition 2
Proposition 3
Proposition 3
proof
Proposition 3
proof
