Hydrogen under Pressure as a Benchmark for Machine-Learning Interatomic Potentials

Thomas Bischoff, Bastian Jäckl, Matthias Rupp

TL;DR

We present a benchmark that automatically quantifies the performance of machine-learning interatomic potentials (MLPs) in molecular dynamics (MD) simulations of a liquid-liquid phase transition in hydrogen under pressure, and show that several state-of-the-art MLPs fail to reproduce this transition.

Abstract

Machine-learning interatomic potentials (MLPs) are fast, data-driven surrogate models of atomistic systems' potential energy surfaces that can accelerate ab-initio molecular dynamics (MD) simulations by several orders of magnitude. The performance of MLPs is commonly measured as the prediction error in energies and forces on data not used in their training. While low prediction errors on a test set are necessary, they do not guarantee good performance in MD simulations. The latter requires physically motivated performance measures obtained from running accelerated simulations. However, the adoption of such measures has been limited by the effort and domain knowledge required to calculate and interpret them. To overcome this limitation, we present a benchmark that automatically quantifies the performance of MLPs in MD simulations of a liquid-liquid phase transition in hydrogen under pressure, a challenging benchmark system. The benchmark's h-llpt-24 dataset provides reference geometries, energies, forces, and stresses from density functional theory MD simulations at different temperatures and mass densities. The benchmark's Python code automatically runs MLP-accelerated MD simulations and calculates, quantitatively compares and visualizes pressures, stable molecular fractions, diffusion coefficients, and radial distribution functions. Employing this benchmark, we show that several state-of-the-art MLPs fail to reproduce the liquid-liquid phase transition.

Paper Structure (16 sections, 3 equations, 12 figures, 9 tables)


Figures (12)

  • Figure 1: Dataset coverage. Each disk represents a combination of Wigner-Seitz radius and temperature for which six simulations were performed, five for training and one for testing.
  • Figure 2: Distribution of energies, forces, and pressures in the h-llpt-24 dataset. Shown are smoothed histograms of energies (left), forces (middle), and pressures (right) in the training (solid), validation (dashed), and test (dotted) subsets.
  • Figure 3: Derived properties based on the reference DFT MD simulations of the h-llpt-24 dataset. Shown are (a) pressure, (b) stable molecular fraction, and (c) diffusion coefficient as a function of the Wigner-Seitz radius $r_s$ for different temperatures (color-coded), as well as (d) radial distribution functions at 1000 K for different Wigner-Seitz radii (color-coded).
  • Figure 4: Illustration of the Hellinger distance. Shown are (a) perfect, (b) intermediate, and (c) vanishing overlap of two one-dimensional distributions (MLP, DFT). The intersection is shown in gray.
  • Figure 5: Derived properties from MLP-accelerated simulations. Shown are stable molecular fractions (top row) and diffusion coefficients (bottom row) for UFP2, UFP3, PACE, and MACE (left to right).
  • ...and 7 more figures
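
The Hellinger distance illustrated in Figure 4 can be sketched in a few lines. The example below is a minimal illustration and not the benchmark's own code: it assumes the standard discrete definition, H = sqrt(1 - sum_i sqrt(p_i q_i)), applied to histograms of two one-dimensional samples on shared bins. The function name `hellinger_distance` and the `bins` parameter are hypothetical choices made for this sketch.

```python
import numpy as np


def hellinger_distance(samples_a, samples_b, bins=50):
    """Hellinger distance between two 1-D samples (hypothetical helper).

    Assumption: both samples are histogrammed on a shared, equally spaced
    grid spanning their combined range, and the standard discrete
    definition H = sqrt(1 - sum_i sqrt(p_i * q_i)) is used.
    """
    a = np.asarray(samples_a, dtype=float)
    b = np.asarray(samples_b, dtype=float)

    # Shared bin edges so that both histograms are directly comparable.
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    edges = np.linspace(lo, hi, bins + 1)

    # Normalize bin counts to probabilities that sum to one.
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()

    # Bhattacharyya coefficient: 1 for identical, 0 for disjoint histograms.
    bc = np.sum(np.sqrt(p * q))

    # Clip guards against tiny negative values caused by rounding.
    return float(np.sqrt(max(0.0, 1.0 - bc)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mlp = rng.normal(0.0, 1.0, 5000)  # stand-in for an MLP-derived property
    dft = rng.normal(1.0, 1.0, 5000)  # stand-in for the DFT reference
    print(hellinger_distance(mlp, dft))
```

Under this definition, a value of 0 corresponds to the perfect overlap of panel (a) and a value of 1 to the vanishing overlap of panel (c). The result depends on the number of bins and the sample size, so both should be held fixed when comparing several MLPs against the same reference.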