
Scaling Kinetic Monte-Carlo Simulations of Grain Growth with Combined Convolutional and Graph Neural Networks

Zhihui Tian, Ethan Suwandi, Tomas Oppelstrup, Vasily V. Bulatov, Joel B. Harley, Fei Zhou

TL;DR

The paper scales kinetic Monte Carlo simulations of grain growth to realistic large domains by combining a CNN-based bijective autoencoder with a GNN that operates in latent space. On a $160^3$ mesh, the largest evaluated, the method reduces inference memory usage by 117x and runtime by 115x relative to a GNN-only baseline, while achieving higher accuracy. Key ideas include lossless spatial compression, reduced GNN depth (3 message-passing layers instead of 12), multi-step training on stochastic Potts Monte Carlo (PMC) data, and latent-space rollouts for efficient long-term predictions. This scalable surrogate enables realistic long-time grain-growth simulations and provides a blueprint for applying neural surrogates to complex grain boundary networks.
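The idea behind latent-space rollouts can be sketched in a few lines. The example below is a toy under stated assumptions, not the paper's algorithm: `encode`, `decode`, and `latent_step` are hypothetical stand-ins (a chunking rearrangement and a deterministic rotation) for the bijective autoencoder and the GNN update. It only illustrates that, when the encoder is bijective, stepping repeatedly in latent space and decoding once gives the same result as decoding and re-encoding around every step, while skipping the repeated codec work.

```python
# Illustrative sketch only: encode, decode, and latent_step are hypothetical
# stand-ins, not the paper's networks.

def encode(state, n=2):
    """Toy bijective encoder: group a 1D state into chunks of n values."""
    return [tuple(state[i:i + n]) for i in range(0, len(state), n)]


def decode(latent):
    """Exact inverse of encode: flatten the chunks back into a 1D state."""
    return [value for chunk in latent for value in chunk]


def latent_step(latent):
    """Toy deterministic update standing in for the GNN: rotate chunks by one."""
    return latent[1:] + latent[:1]


def rollout_with_round_trips(state, steps):
    """Encode, step, and decode at every time step."""
    for _ in range(steps):
        state = decode(latent_step(encode(state)))
    return state


def rollout_in_latent_space(state, steps):
    """Encode once, step repeatedly in latent space, decode once at the end."""
    latent = encode(state)
    for _ in range(steps):
        latent = latent_step(latent)
    return decode(latent)


initial_state = list(range(8))
per_step_result = rollout_with_round_trips(initial_state, steps=3)
latent_result = rollout_in_latent_space(initial_state, steps=3)
```

Both rollouts end in the same state here because `encode(decode(x))` returns `x` exactly. A learned, stochastic model would differ in detail, so this is only a picture of why a lossless encoder makes latent-space rollouts attractive.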

Abstract

Graph neural networks (GNNs) have emerged as a promising machine learning method for microstructure simulations such as grain growth. However, accurate modeling of realistic grain boundary networks requires large simulation cells, to which GNNs are difficult to scale. To alleviate the computational cost and memory footprint of GNNs, we propose a hybrid architecture that combines a convolutional neural network (CNN) based bijective autoencoder, which compresses the spatial dimensions, with a GNN that evolves the microstructure in the latent space of reduced spatial size. Our results demonstrate that the new design significantly reduces computational cost while using fewer message-passing layers (3 instead of 12) than the GNN alone. The reduction in computational cost becomes more pronounced as the spatial size increases, indicating strong computational scalability. For the largest mesh evaluated ($160^3$), our method reduces inference memory usage and runtime by 117x and 115x, respectively, compared with the GNN-only baseline. More importantly, it shows higher accuracy and stronger spatiotemporal capability than the GNN-only baseline, especially in long-term testing. Such a combination of scalability and accuracy is essential for simulating realistic material microstructures over extended time scales. The improvements can be attributed to the bijective autoencoder's ability to compress information losslessly from the spatial domain into a high-dimensional feature space, thereby producing more expressive latent features for the GNN to learn from, while also contributing its own spatiotemporal modeling capability. The training was optimized to learn from the stochastic Potts Monte Carlo method. Our findings provide a highly scalable approach for simulating grain growth.
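To make "lossless compression from the spatial domain into a feature space" concrete, here is a minimal sketch using a fixed space-to-depth rearrangement in 2D. This is an assumption made for illustration, not the paper's learned bijective autoencoder: the function names, the channel ordering, and the 4 x 4 grain-ID map are all hypothetical. It shows only that folding each $n \times n$ block into $n^2$ channels shrinks the spatial grid while remaining exactly invertible.

```python
# Illustrative sketch only: a fixed space-to-depth rearrangement,
# not the paper's learned autoencoder.

def space_to_depth(grid, n):
    """Fold each n x n block of a 2D grid into one cell holding n*n channels."""
    height, width = len(grid), len(grid[0])
    assert height % n == 0 and width % n == 0, "grid size must be divisible by n"
    return [
        [
            [grid[i * n + di][j * n + dj] for di in range(n) for dj in range(n)]
            for j in range(width // n)
        ]
        for i in range(height // n)
    ]


def depth_to_space(latent, n):
    """Exact inverse of space_to_depth: unfold channels back into n x n blocks."""
    rows, cols = len(latent), len(latent[0])
    grid = [[None] * (cols * n) for _ in range(rows * n)]
    for i in range(rows):
        for j in range(cols):
            for k, value in enumerate(latent[i][j]):
                di, dj = divmod(k, n)  # channel index back to in-block offset
                grid[i * n + di][j * n + dj] = value
    return grid


# Hypothetical 4 x 4 map of grain IDs, compressed with n = 2.
grain_ids = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [4, 3, 3, 5],
    [4, 4, 5, 5],
]
latent = space_to_depth(grain_ids, 2)    # 2 x 2 cells, 4 channels each
restored = depth_to_space(latent, 2)     # identical to grain_ids
```

In this sketch the number of spatial cells, and hence the number of nodes a GNN would operate on, drops by a factor of $n^d$ in $d$ dimensions. If $n$ denotes the per-axis factor, this matches the Figure 5 caption's "compression ratio 8 or $n=2$" for 3D.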

Paper Structure

This paper contains 10 sections, 5 equations, 12 figures, 4 tables.

Figures (12)

  • Figure 1: Architecture of the ML models. (a) shows the model structures, and (b) presents the corresponding pseudocode. The detailed network structure is given in Fig. \ref{fig:NN-concept}.
  • Figure 2: Effects of autoencoder (AE) compression ratios on training and inference computational costs and prediction accuracy on an AMD MI300A APU with 3 message-passing layers in the GNN. Memory usage comparison for (a) 2D system of $64^2$ mesh and (b) 3D with $32^3$ mesh; Run time for (c) 2D and (d) 3D; validation RMSE for (e) 2D and (f) 3D.
  • Figure 3: Effects of the number of message-passing layers on the AE+GNN model. Statistical metrics of 3D grain simulations for AE+GNN based on 40 independent predicted and ground-truth trajectories using a $32^3$ mesh. The first, second, and third rows correspond to models using 3, 5, and 12 message-passing layers in the GNN, respectively. From left to right, the columns show the normalized grain diameter distribution, the number of grains, and the average grain area as a function of time.
  • Figure 4: Temporal extrapolation and microstructure visualization of 2D predictions on a $64^2$ mesh (compression ratio $n=4$). The model was trained on trajectories of 25 frames and run for 100 frames of inference in the original and latent spaces using 3 MP layers. The rows show, from the top, ground-truth PMC data, predictions using Algorithm 2, and predictions using Algorithm 3 (latent-space inference).
  • Figure 5: Spatiotemporal extrapolation and microstructure visualization in 3D. (a) Predictions from models trained on a $32^3$ mesh with 25 frames and run on a $96^3$ mesh for 200 frames, for the GNN and for AE+GNN (compression ratio 8, i.e. $n=2$) in the original and latent spaces using 3 MP layers, demonstrating spatiotemporal extrapolation. The GNN baseline performs only a few inference steps before diverging, as indicated by the red rectangle. (b) Statistics of 6 independent predicted and ground-truth trajectories on a $96^3$ mesh for AE+GNN trained on a $32^3$ mesh.
  • ...and 7 more figures