Targeting the partition function of chemically disordered materials with a generative approach based on inverse variational autoencoders

Maciej J. Karcz, Luca Messina, Eiji Kawasaki, Emeric Bourasseau

TL;DR

This work proposes a novel approach where generative machine learning is used to yield a representative set of configurations for accurate property evaluation and provide accurate estimations of atomic-scale properties with minimal computational cost.

Abstract

Computing atomic-scale properties of chemically disordered materials requires an efficient exploration of their vast configuration space. Traditional approaches such as Monte Carlo or Special Quasirandom Structures either entail sampling an excessive number of configurations or do not ensure that the configuration space has been properly covered. In this work, we propose a novel approach where generative machine learning is used to yield a representative set of configurations for accurate property evaluation and provide accurate estimations of atomic-scale properties with minimal computational cost. Our method employs a specific type of variational autoencoder with inverse roles for the encoder and decoder, enabling the application of an unsupervised active learning scheme that does not require any initial training database. The model iteratively generates configuration batches, whose properties are computed with conventional atomic-scale methods. These results are then fed back into the model to estimate the partition function, and the process is repeated until convergence. We illustrate our approach by computing point-defect formation energies and concentrations in (U, Pu)O2 mixed-oxide fuels. In addition, the ML model provides valuable insights into the physical factors influencing the target property. Our method is generally applicable to other properties, such as atomic-scale diffusion coefficients, in ideally or non-ideally disordered materials like high-entropy alloys.
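The idea of feeding computed energies back into a generative model to estimate the partition function can be illustrated on a toy problem. The sketch below is not the authors' model: it assumes a factorized Bernoulli generator `R(x)` over six binary sites, a made-up non-interacting energy with on-site terms `h`, and the standard variational (Jensen) bound `E_R[-beta*E(x) - ln R(x)] <= ln Z`. All names (`energy`, `lower_bound`, `h`, `beta`) are hypothetical and the units are dimensionless.

```python
import itertools
import math
import random

def energy(x, h):
    """Toy non-interacting energy: sum of on-site terms (hypothetical model)."""
    return sum(hi * xi for hi, xi in zip(h, x))

def exact_log_z(h, beta):
    """ln Z by brute-force enumeration of all 2^n occupations (small n only)."""
    return math.log(sum(math.exp(-beta * energy(x, h))
                        for x in itertools.product((0, 1), repeat=len(h))))

def log_r(x, p):
    """ln R(x) for a factorized Bernoulli generator with site probabilities p."""
    return sum(math.log(pi) if xi else math.log(1.0 - pi)
               for pi, xi in zip(p, x))

def lower_bound(p, h, beta, n_samples, rng):
    """Monte Carlo estimate of E_R[-beta*E(x) - ln R(x)], a lower bound on ln Z."""
    total = 0.0
    for _ in range(n_samples):
        # Sample one configuration from the generator.
        x = [1 if rng.random() < pi else 0 for pi in p]
        # The "conventional" energy evaluation is replaced here by the toy energy.
        total += -beta * energy(x, h) - log_r(x, p)
    return total / n_samples

h = [0.3, -0.5, 0.8, -1.0, 0.4, 0.6]   # assumed on-site energies
beta = 2.0                              # assumed inverse temperature
rng = random.Random(0)

ln_z = exact_log_z(h, beta)
# An untrained generator (all sites at probability 0.5) gives a loose bound.
bound_uniform = lower_bound([0.5] * len(h), h, beta, 20000, rng)
# For this non-interacting toy, the Boltzmann site probabilities are known in
# closed form, and the bound then coincides with ln Z for every sample.
p_boltzmann = [1.0 / (1.0 + math.exp(beta * hi)) for hi in h]
bound_optimal = lower_bound(p_boltzmann, h, beta, 200, rng)

print("exact ln Z:        ", ln_z)
print("bound (uniform R): ", bound_uniform)
print("bound (optimal R): ", bound_optimal)
```

In this toy setting, training would amount to adjusting the site probabilities to raise the bound. The gap between the bound and the exact `ln Z` is the Kullback-Leibler divergence between the generator and the Boltzmann distribution, so a tight bound indicates that the generated configurations are representative.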

Paper Structure

This paper contains 15 sections, 27 equations, 14 figures, and 2 tables.

Figures (14)

  • Figure 1: Schematic representation of various machine learning models: a) Classical feedforward neural network; b) Variational autoregressive network architecture from wu2019solving. The model trains by sampling its output distribution; c) Normalizing flow architecture. The major difference from variational autoencoders comes from using invertible functions to map input data to the latent representation, requiring the dimensions of both $\bm{x}$ and $\bm{y}$ to match; d) Autoencoder architecture; e) Variational autoencoder architecture, where $\bm{y}$ is sampled from the encoder distribution instead of being directly transformed from $\bm{x}$; f) Inverse variational autoencoder architecture, similar to classical variational autoencoders but with the roles of encoder and decoder reversed. Here, the dimensions of input and output do not need to match, allowing the sampling of high-dimensional $\bm{x}$ by sampling low-dimensional $\bm{y}$ from, for example, a Bernoulli distribution. Figure inspired by al2022qualitative, normalizingXLippe, and anwar2021difference.
  • Figure 2: Schematic visualization of the IVAE model's unsupervised training loop. Initially, the model parameters are randomly initialized. At each iteration, we sample $\bm{y}$ from $P(\bm{y})$ and then atomic configurations $\bm{x}_\mathrm{c}$ from $R_{\bm{\phi}}(\bm{x}_\mathrm{c}|\bm{y})$, compute the loss function, and update the model weights. This process is repeated until convergence.
  • Figure 3: General architecture of the IVAE models used. Samples from the input distribution $P(\bm{y})$ are represented by $y$. $\phi$ corresponds to the predicted parameters of the $R_{\bm{\phi}}$ distribution from Eq. (\ref{eq:Rxy_phi}). $x'_c$ are the samples from the Gumbel softmax distribution derived from the original $R_{\bm{\phi}}$, and $\theta$ are the predicted parameters of the $Q_{\bm{\theta}}$ distribution from Eq. (\ref{eq:Qyx_theta}). After the model's training is finished, $R_{\bm{\phi}}$ can be used, e.g., to generate atomic configurations. In principle, the model can target different partition functions, thus allowing for sampling other distributions from different systems, e.g., Ising spin systems.
  • Figure 4: The training process of the configuration probability part of the IVAE lower bound. The experiment was done for $T = 500$ K, $|\bm{y}| = 8$, and a global $y_{\mathrm{Pu}}$ concentration equal to 10%. $\bar{\bm{x}}_\mathrm{c}$ indicates that the configuration probability part of the IVAE lower bound was computed as the average value over configurations generated during one training step. $n$ is a normalizing factor that corresponds to the size of the system, which in this case is equal to the number of atoms in the 2nn sphere of influence (18). Each step of the training consisted of generating 50 2nn atomic configurations with the BSD3 type of defect. Three different values of $\epsilon$ were added to the $\bm{\phi}$ parameters of the $R_{\bm{\phi}}$ distribution from Eq. (\ref{eq:lnZ_fq_Jensen_loss}), as explained by Eq. (\ref{eq:phi_constant}).
  • Figure 5: IVAE BSD3 defect concentration for the 1nn and 2nn spheres of influence in MOX with 10% Pu. Exact computations were performed using the 1nn BSD3 database from karcz2023semi and the 2nn BSD3 database developed in this study. The computations utilized the formula given by Eq. (\ref{eq:CdT_concentration_compressed}), with $y_{\mathrm{Pu}} = 0.1$ from Eq. (\ref{eq:p_prime_concentration}). Predictions were generated by models achieving a minimum accuracy of 96.5%, determined by comparing predicted defect concentrations to analytically computed values. For both 1nn and 2nn systems, $|\bm{y}| = 8$ was used. The parameters $\bm{\phi}$ of the predicted $R_{\bm{\phi}}$ distribution were initialized using the formula provided by Eq. (\ref{eq:logit_y}).
  • ...and 9 more figures
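
The caption of Figure 3 mentions samples drawn from a Gumbel softmax distribution. As a rough illustration of what such a draw looks like, the following sketch samples relaxed one-hot occupations (U or Pu) for 18 sites, the size of the 2nn sphere quoted in Figure 4. It is a generic Gumbel softmax sampler in plain Python, not the paper's implementation: the function name `gumbel_softmax`, the temperature `tau = 0.5`, and the use of identical logits on every site (set to a 10% Pu probability) are assumptions made for the example.

```python
import math
import random

def gumbel_softmax(logits, tau, rng):
    """Draw one relaxed one-hot sample from the Gumbel softmax distribution."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform; clamp u to avoid log(0).
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    m = max(noisy)
    weights = [math.exp(v - m) for v in noisy]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(1)
# Assumed per-site species probabilities: index 0 = U, index 1 = Pu.
logits = [math.log(0.9), math.log(0.1)]

# One relaxed configuration: 18 sites, each a soft (U, Pu) occupation vector.
sites = [gumbel_softmax(logits, tau=0.5, rng=rng) for _ in range(18)]
soft_pu_fraction = sum(s[1] for s in sites) / len(sites)
hard_config = ["Pu" if s[1] > s[0] else "U" for s in sites]

print("soft Pu fraction:", soft_pu_fraction)
print("hard configuration:", hard_config)
```

Taking the argmax of each soft vector recovers a discrete configuration whose species follow the underlying categorical probabilities, while the soft vectors themselves stay differentiable with respect to the logits. Lower values of `tau` push the soft vectors closer to one-hot at the cost of noisier gradients.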