SMC-X: A Distributed Scalable Monte Carlo Simulation Method for Chemically Complex Alloys

Xianglin Liu, Kai Yang, Fanli Zhou, Pengxiang Xu

TL;DR

This work tackles the challenge of simulating chemically complex high-entropy alloys (HEAs) at large spatial and temporal scales by introducing distributed SMC-X, a Monte Carlo framework that leverages dynamical Link-Cells (LC) and Local Interaction Zones (LIZ) to expose parallelism on CPUs and GPUs. By mapping algorithmic parallelism to hardware, SMC-X achieves record-scale atomistic simulations, including $N_{atom}=1.28\times10^{11}$ on 32 GPUs and $N_{atom}=1.066\times10^{9}$ with $3\times10^{6}$ MC steps, enabling diffusion-controlled nanoprecipitate analysis via Lifshitz-Slyozov-Wagner theory. The study compares two ML-interaction schemes, EPI and qSRO, and demonstrates superior performance of EPI on GPUs while showing robust scaling with qSRO; large-scale simulations reveal Ti/Ni segregation and Ni/Ti-enriched $L1_2$ nanoprecipitates, connecting computational predictions with APT-like observations. Overall, SMC-X provides a framework for simulation-driven design of chemically complex materials by extending atomistic simulations to unprecedented spatial and temporal regimes, bridging experimental and theoretical perspectives.
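As a rough sanity check on the largest reported system, the arithmetic below estimates the per-GPU atom count and the edge length of the simulation box. Only $N_{atom}=1.28\times10^{11}$ and the 32 GPUs come from the text; the FCC conventional cell (4 atoms), the cubic box shape, and the lattice constant of 0.36 nm are assumptions made here for illustration.

```python
# Back-of-the-envelope size check for the largest reported system.
N_ATOM = 1.28e11   # total number of atoms, from the text
N_GPU = 32         # number of GPUs, from the text

# Assumptions (NOT from the source): FCC conventional cell with 4 atoms,
# a cubic simulation box, and a lattice constant of 0.36 nm.
ATOMS_PER_CELL = 4
LATTICE_CONSTANT_NM = 0.36


def atoms_per_gpu(n_atom, n_gpu):
    """Average number of atoms each GPU has to hold."""
    return n_atom / n_gpu


def cubic_box_edge_nm(n_atom, atoms_per_cell, a_nm):
    """Edge length of a cubic box containing n_atom lattice sites."""
    n_cells = n_atom / atoms_per_cell
    return n_cells ** (1.0 / 3.0) * a_nm


per_gpu = atoms_per_gpu(N_ATOM, N_GPU)
edge_nm = cubic_box_edge_nm(N_ATOM, ATOMS_PER_CELL, LATTICE_CONSTANT_NM)

print(f"atoms per GPU: {per_gpu:.2e}")
print(f"box edge: {edge_nm / 1000:.2f} micrometers")
```

Under these assumptions each GPU holds about four billion atoms and the box edge is slightly above one micrometer, consistent with the "micrometer regime" described in the abstract.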

Abstract

To predict the complex chemical evolution in multicomponent alloys, it is highly desirable to have accurate atomistic simulation methods capable of reaching sufficiently large spatial and temporal scales. In this work, we advance the recently proposed SMC-X method through distributed computation on either GPUs or CPUs, pushing both spatial and temporal scales of atomistic simulation of chemically complex alloys to previously inaccessible scales. This includes a record-breaking 128-billion-atom HEA system extending to the micrometer regime in space, and a 1-billion-atom HEA evolved over more than three million Monte Carlo swap steps, approaching the minute regime in time. We show that such large-scale simulations are essential for bridging the gap between experimental observations and theoretical predictions of the nanoprecipitate sizes in HEAs, based on analysis using the Lifshitz-Slyozov-Wagner (LSW) theory for diffusion-controlled coarsening. This work demonstrates the great potential of SMC-X for simulation-driven exploration of the chemical complexity in high-entropy materials at large spatial and temporal scales.
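The need for long simulation times can be illustrated with the classical LSW coarsening law, $\bar{r}^3(t) - \bar{r}^3(0) = K t$, where $\bar{r}$ is the mean precipitate radius and $K$ is a rate constant. The sketch below is a minimal illustration of that cubic law, not the analysis performed in the paper; the function names and the dimensionless values of `r0` and `rate_k` are hypothetical.

```python
def lsw_mean_radius(r0, rate_k, t):
    """Mean radius at time t from the LSW law r^3(t) = r0^3 + K * t."""
    return (r0**3 + rate_k * t) ** (1.0 / 3.0)


def lsw_time_to_reach(r0, rate_k, r_target):
    """Time needed for the mean radius to grow from r0 to r_target."""
    return (r_target**3 - r0**3) / rate_k


# Hypothetical, dimensionless values chosen only to show the scaling.
r0, rate_k = 1.0, 1.0

t_first_doubling = lsw_time_to_reach(r0, rate_k, 2.0 * r0)
t_second_doubling = lsw_time_to_reach(2.0 * r0, rate_k, 4.0 * r0)

print(t_first_doubling, t_second_doubling)
```

Because the radius grows only as $t^{1/3}$, each successive doubling of the mean radius takes eight times longer than the previous one, which is why reaching experimentally relevant precipitate sizes requires very long Monte Carlo trajectories.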

Paper Structure

This paper contains 10 sections, 11 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: A schematic of the workflow to illustrate the main contributions. MLP: machine learning potential; LC: link-cell; LIZ: local interaction zone; T: temperature; SIMD: single-instruction multiple-data; SS: solid solution; SRO: short-range order; LRO: long-range order; MCINP: multicomponent intermetallic nanoparticle.
  • Figure 2: A schematic of the SMC-X algorithm, including the link-cell (LC), the local interaction zone (LIZ), and the domain decomposition scheme, using the 2D square lattice as an example. LIZ+ denotes the sites needed to calculate the energy change due to a possible MC trial move. Adapted from Ref. liuNPJ2025.
  • Figure 3: Illustration of the SMC-GPU and SMC-CPU algorithms. In SMC-GPU, the LIZ and LC degrees of parallelism are allocated to the cuda.blocks and cuda.threads. In SMC-CPU, part of the LC parallelism (the x and y directions) is distributed across MPI processes, while the remaining LC parallelism and the LIZ parallelism are handled by OpenMP threads.
  • Figure 4: Comparison of different ML-accelerated atomistic simulation methods: (a) shows the number of atoms that can be simulated per chip, (b) displays the computational throughput achievable per chip, and (c) compares the performance of our results with previous SOTA in terms of total $N_{atom}$ ($x$-axis), total throughput ($y$-axis), throughput per chip (color), and the number of chips (circles with radius proportional to $N_{Chip}^{1/3}$). See original data at Tab. \ref{tab:Performance}.
  • Figure 5: The strong and weak scaling of SMC-GPU and SMC-CPU with the EPI and qSRO models. CCO denotes computation-communication overlap.
  • ...and 3 more figures