A Kokkos-Accelerated Moment Tensor Potential Implementation for LAMMPS

Zijian Meng, Karim Zongo, Edmanuel Torres, Christopher Maxwell, Ryan Eric Grant, Laurent Karim Béland

TL;DR

The paper addresses scaling machine-learning interatomic potentials, specifically the Moment Tensor Potential (MTP), to large-scale simulations on heterogeneous HPC hardware. It introduces a Kokkos-enabled MTP implementation for LAMMPS with nine variants: three use cases (inference, configuration-mode active learning, neighborhood-mode active learning), each available in three execution modes (an optimized CPU variant, a thread-parallel GPU variant, and a block-parallel GPU variant). Benchmarks show good weak and strong scaling and substantial speedups, including up to $2\times$ on CPUs over existing implementations, while preserving uncertainty quantification and active-learning capabilities. Together, these results enable million-atom simulations and on-the-fly active learning on accessible HPC platforms, broadening the practical reach of MTPs in materials modeling.

Abstract

We present a Kokkos-accelerated implementation of the Moment Tensor Potential (MTP) for LAMMPS, designed to improve both computational performance and portability across CPUs and GPUs. This package introduces an optimized CPU variant--achieving up to 2x speedups over existing implementations--and two new GPU variants: a thread-parallel version for large-scale simulations and a block-parallel version optimized for smaller systems. It supports three core functionalities: standard inference, configuration-mode active learning, and neighborhood-mode active learning. Benchmarks and case studies demonstrate efficient scaling to million-atom systems, substantially extending accessible length and time scales while preserving the MTP's near-quantum accuracy and native support for uncertainty quantification.

Paper Structure

This paper contains 10 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: A log-log plot of the inference simulation rate versus atom count for several MTP implementations on various hardware at selected MTP levels. A separate 100-timestep simulation is run for each atom count and method, and the best-of-five (Bo5) simulation rate is reported. A 1 fs timestep is used.
  • Figure 3: The active learning speedups (relative maximum throughput) for both configuration and neighborhood mode over the previous MLIP-3 implementation.
  • Figure 4: Shearing of a C2 core type screw dislocation in silicon (115 thousand atoms, $5\times10^7$ s$^{-1}$ strain rate, 1 ns, 1 fs timestep). Top-Left: unstrained. Bottom-Left: strained. Right: engineering shear stress-strain curve.
  • Figure 5: Nanocrystalline tension of aluminum (1.00 million atoms, $10^8$ s$^{-1}$ strain rate, 1 ns, 1 fs timestep). Left: unstrained. Center: strained (0.1 strain). Right: engineering stress-strain curve.
  • Figure 6: A eutectic sodium (red) and potassium (blue) solid-liquid interface which may be out-of-distribution and whose uncertainty was thus tested with active learning enabled. Solid (C14 + BCC) pictured left; liquid pictured right.
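
The run parameters quoted in the captions of Figures 4 and 5 can be cross-checked with a little arithmetic. The sketch below assumes a constant engineering strain rate applied for the full run; the helper names are hypothetical and come from neither the paper nor LAMMPS. Under that assumption, the aluminum case reproduces the 0.1 strain stated in the Figure 5 caption, while the silicon value is a derived figure, not one quoted in the source.

```python
# Illustrative helpers only: these names are assumptions, not part of the
# paper's package or of LAMMPS.

FS_PER_S = 1e15  # femtoseconds per second


def n_steps(duration_fs: int, timestep_fs: int) -> int:
    """Number of MD timesteps needed to cover duration_fs."""
    if timestep_fs <= 0:
        raise ValueError("timestep must be positive")
    return duration_fs // timestep_fs


def total_strain(strain_rate_per_s: float, duration_fs: int) -> float:
    """Engineering strain accumulated at a constant strain rate (assumed)."""
    return strain_rate_per_s * (duration_fs / FS_PER_S)


ONE_NS_IN_FS = 1_000_000  # 1 ns expressed in femtoseconds

# Strain rates taken from the Figure 4 and Figure 5 captions.
cases = {
    "Si screw dislocation (Fig. 4)": 5e7,
    "Al nanocrystal tension (Fig. 5)": 1e8,
}

for name, rate in cases.items():
    steps = n_steps(ONE_NS_IN_FS, 1)  # 1 fs timestep, 1 ns run
    strain = total_strain(rate, ONE_NS_IN_FS)
    print(f"{name}: {steps} steps, final strain ~ {strain:.3f}")
```

Both runs span 1 ns at a 1 fs timestep, so each covers one million timesteps.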