Transferable Boltzmann Generators

Leon Klein; Frank Noé

TL;DR

The paper tackles the long-standing challenge of sampling equilibrium molecular ensembles by introducing transferable Boltzmann Generators (TBGs) built on continuous normalizing flows trained with flow matching. These models are transferable across chemical space, enabling zero-shot Boltzmann sampling and efficient reweighting for unseen dipeptides, with an architecture that encodes molecular topology through equivariant graph networks. Empirical results on alanine dipeptide and the 2AA dipeptide dataset show superior effective sample sizes, accurate free-energy projections, and reliable metastable-state coverage for unseen systems, often with data-efficient training. The work suggests a path toward scalable, transferable, and fast Boltzmann sampling for small molecules, with potential extensions to larger systems and more expensive force fields. It also acknowledges limitations, such as limited control over convergence and the need to validate topology during inference.

Abstract

The generation of equilibrium samples of molecular systems has been a long-standing problem in statistical physics. Boltzmann Generators are a generative machine learning method that addresses this issue by learning a transformation via a normalizing flow from a simple prior distribution to the target Boltzmann distribution of interest. Recently, flow matching has been employed to train Boltzmann Generators for small molecular systems in Cartesian coordinates. We extend this work and propose a first framework for Boltzmann Generators that are transferable across chemical space, such that they predict zero-shot Boltzmann distributions for test molecules without being retrained for these systems. These transferable Boltzmann Generators allow approximate sampling from the target distribution of unseen systems, as well as efficient reweighting to the target Boltzmann distribution. The transferability of the proposed framework is evaluated on dipeptides, where we show that it generalizes efficiently to unseen systems. Furthermore, we demonstrate that our proposed architecture enhances the efficiency of Boltzmann Generators trained on single molecular systems.
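The reweighting step mentioned in the abstract can be illustrated with a minimal sketch. It is a toy example, not the paper's code: the "generator" is a hypothetical 1D Gaussian proposal with known density `log_q`, the target is a harmonic reduced energy `u(x) = x^2 / 2`, and the helper names are assumptions introduced here. It computes self-normalized importance weights `w_i ∝ exp(-u(x_i)) / q(x_i)` and the Kish effective sample size, which is a common definition of the ESS reported for Boltzmann Generators.

```python
import numpy as np


def normalized_importance_weights(reduced_energy, log_q):
    """Self-normalized weights w_i proportional to exp(-u(x_i)) / q(x_i)."""
    log_w = -reduced_energy - log_q
    log_w = log_w - np.max(log_w)  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / np.sum(w)


def kish_effective_sample_size(weights):
    """Kish ESS for normalized weights: 1 / sum_i w_i^2."""
    return 1.0 / np.sum(weights ** 2)


# Toy setup (assumption): the proposal q = N(0, 1.5^2) stands in for a
# trained generator; the target is exp(-u(x)) with u(x) = x^2 / 2.
rng = np.random.default_rng(0)
n = 20_000
sigma_q = 1.5
x = rng.normal(0.0, sigma_q, size=n)

reduced_energy = 0.5 * x ** 2
log_q = -0.5 * (x / sigma_q) ** 2 - np.log(sigma_q * np.sqrt(2.0 * np.pi))

weights = normalized_importance_weights(reduced_energy, log_q)
ess_fraction = kish_effective_sample_size(weights) / n

# Reweighted estimate of <x^2> under the target (exact value is 1).
second_moment = np.sum(weights * x ** 2)

print(f"relative ESS: {ess_fraction:.3f}")
print(f"reweighted <x^2>: {second_moment:.3f}")
```

In a real Boltzmann Generator, `log_q` would come from the flow's change-of-variables formula and `reduced_energy` from the force field. The closer the generated distribution is to the target, the closer the relative ESS is to 1.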

Paper Structure (39 sections, 8 equations, 22 figures, 6 tables)

Introduction
Related work
Boltzmann Generators and Normalizing Flows
Boltzmann Generators
Continuous Normalizing Flows (CNFs)
Equivariant flows
Flow matching
Transferable Boltzmann Generators
Architecture
Training transferable Boltzmann Generators
Inference with transferable Boltzmann Generators
Experiments
Alanine dipeptide
Dipeptides (2AA)
Training on a biased training set
...and 24 more sections

Figures (22)

Figure 1: Results for the alanine dipeptide system simulated with a classical force field. (a) Ramachandran plots for the biased MD distribution (left) and for samples generated with the TBG + full model (right). (b) Energies of samples generated with different methods. (c) Free energy projection along the slowest transition ($\varphi$ angle), computed with different methods.
Figure 2: Results for the KS dipeptide. (a) Sample generated with the TBG + full model. (b) Ramachandran plot for the weighted MD distribution (left) and for samples generated with the TBG + full model (right). (c) TICA plot for the weighted MD distribution (left) and for samples generated with the TBG + full model (right). (d) Energies of samples generated with different methods and architectures. (e) Free energy projection along the $\varphi$ angle. (f) Free energy projection along the slowest transition (TIC0).
Figure 3: Results for the GN dipeptide. (a) Sample generated with the TBG + full model. (b) Ramachandran plot for the weighted MD distribution (left) and for samples generated with the TBG + full model (right). (c) TICA plot for the weighted MD distribution (left) and for samples generated with the TBG + full model (right). (d) Energies of samples generated with different methods and architectures. (e) Free energy projection along the $\varphi$ angle. (f) Free energy projection along the slowest transition (TIC0).
Figure 4: (a) Effective sample sizes (ESS) for the first 8 test peptides for different transferable architectures and training sets. (b) Free energy projection along the $\varphi$ angle for the TBG + full model trained on the biased dataset for the KS dipeptide. The weighted free energy projection demonstrates a superior fit compared to the TBG + full model (see Figure 2e). (c) Free energy projection along the $\varphi$ angle for the TBG + full model trained on the biased dataset for the GN dipeptide. The weighted free energy projection demonstrates a superior fit compared to the TBG + full model (see Figure 3e).
Figure 5: TBG / Timewarp MCMC (Klein et al., 2023) sampling experiments. Wasserstein distance between the generated Ramachandran plot and the MD Ramachandran plot for different computational budgets for dipeptides. Lower is better. (a) After 30000 energy evaluations. (b) After 12 h of wall-clock time. (c) After 24 h of wall-clock time.
...and 17 more figures
