Evaluating Mechanical Property Prediction across Material Classes using Molecular Dynamics Simulations with Universal Machine-Learned Interatomic Potentials

Konstantin Stracke, Connor W. Edwards, Jack D. Evans

Abstract

We assess the accuracy of six universal machine-learned interatomic potentials (MLIPs) for predicting the temperature and pressure response of materials in molecular dynamics simulations. Accuracy is evaluated across 13 diverse materials (nine metal-organic frameworks and four inorganic compounds) for three properties: bulk modulus, thermal expansion, and thermal decomposition. These MLIPs employ three different architectures (graph neural networks, graph network simulators, and graph transformers) and are trained on varying datasets. We observe qualitative accuracy across these predictions, but all models systematically underestimate the bulk modulus and overestimate the thermal expansion, consistent with potential energy surface softening. Among the tested models, three top performers emerge: `MACE-MP-0a', `fairchem_OMAT', and `Orb-v3', with average errors across metrics and materials of 41%, 44%, and 47%, respectively. Despite strong overall performance, questions arise about the limits of model transferability: dataset homogeneity and structural representation dominate model accuracy. Our results show that certain architectures can compensate for biases, a step closer to truly universal MLIPs.

Paper Structure

This paper contains 4 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Summary of the bulk modulus ($K_T$), volumetric thermal expansion coefficient ($\alpha_V$), and decomposition temperature for all materials, as predicted by the models. Corresponding literature values (Table \ref{tab:references}) are highlighted for direct comparison.
  • Figure 2: Prediction deviations for bulk modulus $K_T$, coefficient of thermal expansion (CTE) $\alpha_V$, and decomposition temperature across models, with materials indicated in the legend. Materials for which no reference value is available are omitted.
  • Figure 3: Mean absolute error (MAE, $\%$) of all models across bulk modulus $K_T$, CTE $\alpha_V$, decomposition temperature, and efficiency. The average accuracy across metrics (excluding efficiency) is indicated.
  • Figure 4: Model performance for predicting bulk modulus ($K_T$), CTE ($\alpha_V$), and decomposition temperature across selected material subsets. a) Orb-v3 and fairchem_OMAT evaluated on isotropic (iso.) and anisotropic (aniso.) MOFs. b) MACE-MOF and fairchem_OMAT evaluated on zinc-based MOFs separated by linker type (Zn carboxylate vs. Zn azolate). Connecting curves added to aid visual interpretation.
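
The quantities named in the captions above have standard thermodynamic definitions: the isothermal bulk modulus $K_T = -V (\partial P / \partial V)_T$ and the volumetric thermal expansion coefficient $\alpha_V = (1/V)(\partial V / \partial T)_P$. The sketch below is a minimal illustration, not the authors' workflow. It assumes both quantities are estimated from linear least-squares fits of volume against pressure or temperature, and that the percentage error is the mean absolute deviation relative to the reference value. All function names and all numbers are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bulk_modulus(pressures, volumes):
    """K_T = -V0 / (dV/dP), with V0 the fitted volume at zero pressure.

    Assumption: V(P) is close to linear over the sampled pressure range.
    The result has the same unit as the pressures.
    """
    dv_dp, v0 = linear_fit(pressures, volumes)
    return -v0 / dv_dp


def thermal_expansion(temperatures, volumes, t_ref):
    """alpha_V = (dV/dT) / V(t_ref), from a linear fit of V against T.

    Assumption: V(T) is close to linear over the sampled temperature range.
    """
    dv_dt, intercept = linear_fit(temperatures, volumes)
    v_ref = dv_dt * t_ref + intercept
    return dv_dt / v_ref


def mean_abs_pct_error(predicted, reference):
    """Mean absolute error in percent, relative to the reference values."""
    errors = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


# Synthetic, made-up data: mean cell volumes (cubic angstrom) from
# constant-pressure runs at several pressures (GPa) and temperatures (K).
pressures = [0.0, 0.1, 0.2, 0.3]
volumes_at_p = [1000.0, 995.0, 990.0, 985.0]
temperatures = [100.0, 200.0, 300.0, 400.0]
volumes_at_t = [1002.0, 1004.0, 1006.0, 1008.0]

k_t = bulk_modulus(pressures, volumes_at_p)                           # in GPa
alpha_v = thermal_expansion(temperatures, volumes_at_t, t_ref=300.0)  # in 1/K
print(k_t, alpha_v)
```

A softened potential energy surface would appear in this picture as a steeper volume response to pressure and temperature, which lowers the fitted $K_T$ and raises $\alpha_V$.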