Efficient mapping of phase diagrams with conditional Boltzmann Generators

Maximilian Schebek, Michele Invernizzi, Frank Noé, Jutta Rogal

TL;DR

Deep generative machine learning models based on the Boltzmann Generator approach for entire phase diagrams are developed, employing normalizing flows conditioned on the thermodynamic states, e.g. temperature and pressure, that they map to.

Abstract

The accurate prediction of phase diagrams is of central importance both for the fundamental understanding of materials and for technological applications in materials science. However, the computational prediction of the relative stability between phases based on their free energy is a daunting task, as traditional free energy estimators require a large amount of simulation data to obtain uncorrelated equilibrium samples over a grid of thermodynamic states. In this work, we develop deep generative machine learning models based on the Boltzmann Generator approach for entire phase diagrams, employing normalizing flows conditioned on the thermodynamic states, e.g., temperature and pressure, that they map to. By training a single normalizing flow to transform the equilibrium distribution sampled at only one reference thermodynamic state to a wide range of target temperatures and pressures, we can efficiently generate equilibrium samples across the entire phase diagram. A permutation-equivariant architecture allows us to treat solid and liquid phases on the same footing. We demonstrate our approach by predicting the solid-liquid coexistence line for a Lennard-Jones system, in excellent agreement with state-of-the-art free energy methods, while significantly reducing the number of energy evaluations needed.
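The idea of mapping samples from one reference state to a target state, and reading off a free energy difference from the resulting importance weights, can be illustrated on a toy problem. The sketch below is not the paper's model: it assumes a 1D harmonic oscillator (with $k_B = 1$) and replaces the learned conditional flow with the analytically known scaling map, so that the estimate can be checked against the exact result. All function names are hypothetical.

```python
import numpy as np

def reduced_energy(x, T, k=1.0):
    """Reduced potential u(x) = U(x) / T for a 1D harmonic oscillator (k_B = 1)."""
    return 0.5 * k * x**2 / T

def map_to_target(x, T0, T):
    """Toy stand-in for a temperature-conditioned flow: exact scaling map.

    Returns the mapped samples and the log-Jacobian determinant of the map.
    """
    scale = np.sqrt(T / T0)
    return x * scale, np.full_like(x, np.log(scale))

def delta_f(x0, T0, T, k=1.0):
    """Reduced free energy difference f(T) - f(T0) from mapped reference samples.

    Uses the targeted free energy perturbation estimator
    delta_f = -ln < exp(-[u_T(M(x)) - u_T0(x) - ln|det J_M(x)|]) >_T0.
    """
    x1, logdet = map_to_target(x0, T0, T)
    log_w = -reduced_energy(x1, T, k) + reduced_energy(x0, T0, k) + logdet
    m = log_w.max()  # shift before exponentiating for numerical stability
    return -(m + np.log(np.mean(np.exp(log_w - m))))

rng = np.random.default_rng(0)
T0, T = 1.0, 2.0
# Reference samples: Boltzmann distribution at T0 is Gaussian with variance T0 / k.
x0 = rng.normal(scale=np.sqrt(T0), size=10_000)

estimate = delta_f(x0, T0, T)
exact = -0.5 * np.log(T / T0)  # from Z(T) = sqrt(2 * pi * T / k)
print(estimate, exact)
```

Because the map here is exact, all importance weights are equal and the estimate has zero variance. A learned flow only approximates this ideal, which is why the quality of the weights has to be monitored in practice.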

Paper Structure (19 sections, 32 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Workflow of the conditional Boltzmann Generator (BG). The prior equilibrium distribution $q_{{\bf c}_0}$ at a reference thermodynamic state ${\bf c}_0=(T_0,P_0)$ is sampled with MD. A BG conditioned on the thermodynamic variables transforms this distribution to approximate the target equilibrium distributions $q_{{\bf c}_i}$ at different thermodynamic states ${\bf c}_i$. The BG is trained by minimizing the conditional loss function $\mathcal{L}(\theta)$, given as the expectation value, over thermodynamic states ${\bf c}\sim p_{\bf c}$, of the single-point KL divergences (Eq. \ref{eq:cond_loss_explicit}).
  • Figure 2: Flow transformation. a, Schematic of the conditional transformation of one sample $f({\bf x},V)=({\bf x}',V')$ consisting of an atomic configuration ${\bf x}$ and a volume parameter $V$. The transformation of the box parameter is a simple isotropic scaling operation ($L'_i/L'_j\!=\!L_i/L_j$) learned by an MLP conditioned on $(T,P)$. The configuration ${\bf x}$ is scaled to fractional coordinates ${\bf s}$ and then transformed by a configurational coupling flow $f_{\bf s}$ conditioned on $T$, $P$, and $V'$. As a final step, the configuration is scaled back to real coordinates. b, Illustration of the configurational flow transformation, where each flow layer transforms a subset of the Cartesian coordinates of all particles. The dashed box is a sketch of one $(T,P,V')$-conditional coupling layer. The sets of fractional coordinates and spline parameters are denoted as $\{s\} = \{s_1,\dots, s_N\}$ and $\{\psi\} = \{\psi_1,\dots, \psi_N\}$, respectively (see Sec. \ref{sec:layer} for details). Dashed lines denote splitting / merging operations.
  • Figure 3: Training results for 180 Lennard-Jones particles in the FCC phase. a, KL divergence (Eq. \ref{eq:kl_div}) between generated and target distributions at the evaluation state ${\bf c}_{\rm eval}$ obtained from a flow trained only on the evaluation state (orange) and from a flow trained in a conditional way (blue). Thermodynamic states ($T^*_0,P^*_0$) and ($T^*_{\rm eval},P^*_{\rm eval}$) are marked as a blue cross and a blue star, respectively, in b. Learning rate reductions were applied for the conditional flow after 30 and 45 epochs. The orange dashed line marks the training time after which the sampling efficiency of the single-point flow was evaluated. b and c, Sampling efficiency (Eq. \ref{eq:ess_kish}) for conditional and single-point flow, respectively, in % evaluated over the range of thermodynamic states used for training the conditional flow. d, Deviation in $\Delta G^*$ (relative to ($T^*_0,P^*_0$)) of the conditional flow from MD+MBAR along the path defined by the black arrow in b. The dashed lines indicate one standard deviation evaluated over 10 flow runs; the shaded area corresponds to the maximum standard deviation of 10 MD+MBAR runs. e and f, Radial distribution function of configurations from the prior, generated, and reweighted distributions of the conditional flow compared to MD at different thermodynamic states. g, 2D projections of 500 different configurations and mean box dimensions. The left-most panel contains samples from the prior distribution; the remaining panels show configurations generated for increasing temperature and decreasing pressure. For reference, prior configurations and mean box are superimposed on the generated configurations as orange shadows and dashed lines, respectively. In all plots, the fixed particle is located at (0,0).
  • Figure 4: Coexistence line. a, Contour plot of $\Delta G^* = G^*_{\rm sol} - G^*_{\rm liq}$ per particle as obtained from the conditional BGs. Liquid-solid coexistence corresponds to $\Delta G^*\!=\!0$. The dashed line denotes the coexistence line obtained from a high-accuracy MD+MBAR run. The locations of solid and liquid priors are marked as blue and red circles, respectively. b, Deviation in the melting temperature of the conditional BG from MD+MBAR. The uncertainties of the BG predictions were obtained as the standard deviation over 10 independent runs. The shaded area indicates the maximum fluctuations over 10 MD+MBAR runs.
  • Figure S1: Sampling efficiency of the conditional flow for the fluid phase in %. The blue cross denotes the prior location $T^* = 1.4,P^* = 6$.
  • ...and 1 more figures
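
Two quantities recur in the captions above: the sampling efficiency based on the Kish effective sample size (Figure 3) and the coexistence condition $\Delta G^* = 0$ (Figure 4). The sketch below shows one common way to compute each. It assumes the standard Kish definition, $(\sum_i w_i)^2 / \sum_i w_i^2$ divided by the number of samples, and locates the coexistence temperature by linear interpolation on a temperature grid at fixed pressure. The function names and the numbers in the example are hypothetical and are not taken from the paper.

```python
import numpy as np

def kish_efficiency(log_w):
    """Kish effective sample size divided by N, from unnormalized log-weights."""
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())  # shift for numerical stability
    return w.sum() ** 2 / (len(w) * np.sum(w ** 2))

def coexistence_temperature(T, dG):
    """Temperature at which dG = G_sol - G_liq first changes sign on the grid.

    T and dG are 1D arrays at fixed pressure; the zero crossing is found by
    linear interpolation between the two bracketing grid points.
    """
    T = np.asarray(T, dtype=float)
    dG = np.asarray(dG, dtype=float)
    idx = np.where(np.sign(dG[:-1]) != np.sign(dG[1:]))[0]
    if idx.size == 0:
        raise ValueError("dG does not change sign on this grid")
    i = idx[0]
    return T[i] - dG[i] * (T[i + 1] - T[i]) / (dG[i + 1] - dG[i])

# Equal weights give 100 % efficiency; one dominant weight among N gives 1/N.
eff_uniform = kish_efficiency(np.zeros(100))
eff_collapsed = kish_efficiency([0.0, -1000.0, -1000.0, -1000.0])

# Made-up free energy differences: solid stable (dG < 0) at low temperature.
T_grid = [0.8, 1.0, 1.2]
dG_grid = [-0.1, 0.1, 0.3]
T_melt = coexistence_temperature(T_grid, dG_grid)
print(eff_uniform, eff_collapsed, T_melt)
```

Repeating the interpolation for each pressure on the grid traces out a coexistence line of the kind shown as the $\Delta G^* = 0$ contour.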