Cartesian atomic cluster expansion for machine learning interatomic potentials

Bingqing Cheng

TL;DR

The paper introduces Cartesian Atomic Cluster Expansion (CACE), a set of rotationally invariant, polynomially independent features derived directly in Cartesian coordinates to replace spherical-harmonic-based expansions. By constructing an edge basis from radial, angular, and edge-type components and projecting it onto atom-centered densities, CACE preserves body-order completeness while allowing each invariant to be evaluated efficiently and independently, with a lightweight embedding of chemical elements. Across bulk water, small molecules (ethanol and 3BPA), and a 25-element high-entropy alloy dataset, CACE demonstrates competitive accuracy, stable MD up to high temperatures, and alchemical extrapolation to unseen elements. The approach offers a scalable, interpretable alternative to ACE and E(3)-equivariant MPNNs, with potential for pretraining and foundation-model-style application across the periodic table.

Abstract

Machine learning interatomic potentials are revolutionizing large-scale, accurate atomistic modelling in materials science and chemistry. Many potentials use atomic cluster expansion or equivariant message passing frameworks. Such frameworks typically use spherical harmonics as angular basis functions, and then use Clebsch-Gordan contraction to maintain rotational symmetry, which may introduce redundancies in representations and computational overhead. We propose an alternative: a Cartesian-coordinates-based atomic density expansion. This approach provides a complete set of polynomially independent features of atomic environments while maintaining interaction body orders. Additionally, we integrate low-dimensional embeddings of various chemical elements and inter-atomic message passing. The resulting potential, named Cartesian Atomic Cluster Expansion (CACE), exhibits good accuracy, stability, and generalizability. We validate its performance in diverse systems, including bulk water, small molecules, and 25-element high-entropy alloys.
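To make the idea of a Cartesian density expansion concrete, the following is a minimal sketch, not the paper's implementation. It assumes that the angular part is built from products of the Cartesian components of the unit bond vector, that a single cosine cutoff stands in for the radial basis, and that there is only one chemical species. The function names (`cutoff_weight`, `density_moments`, `invariants`) and the cutoff radius are hypothetical. It shows that fully contracting the Cartesian indices of the atom-centered moments yields rotationally invariant scalars without any Clebsch-Gordan coefficients.

```python
import numpy as np

def cutoff_weight(r, r_cut):
    # Smooth cosine cutoff; an illustrative stand-in for a radial basis.
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)

def density_moments(neighbors, r_cut=5.0):
    """Atom-centered moments of the neighbor density.

    neighbors: (N, 3) array of vectors from the central atom to its neighbors.
    Returns the order-0, order-1, and order-2 Cartesian moments.
    """
    r = np.linalg.norm(neighbors, axis=1)
    u = neighbors / r[:, None]                   # unit bond vectors
    w = cutoff_weight(r, r_cut)                  # radial weights
    A0 = w.sum()                                 # scalar
    A1 = np.einsum("n,na->a", w, u)              # vector
    A2 = np.einsum("n,na,nb->ab", w, u, u)       # rank-2 tensor
    return A0, A1, A2

def invariants(neighbors, r_cut=5.0):
    # Contract every Cartesian index so that no free index remains.
    A0, A1, A2 = density_moments(neighbors, r_cut)
    return np.array([
        A0,                             # one moment
        A1 @ A1,                        # product of two order-1 moments
        np.einsum("ab,ab->", A2, A2),   # product of two order-2 moments
        A1 @ A2 @ A1,                   # product of three moments
    ])

# Usage: the features are unchanged when the environment is rotated.
rng = np.random.default_rng(0)
env = rng.normal(size=(6, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # random orthogonal matrix
print(np.allclose(invariants(env), invariants(env @ Q.T)))
```

Each feature multiplies a fixed number of density moments, which is how the number of factors can be tied to the interaction body order in expansions of this kind.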

Paper Structure (7 sections, 15 equations, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Schematic of the CACE potential. a-h show each step of the operation, and i illustrates the rule for making rotationally invariant features.
  • Figure 2: The total number of angular features $N_L$ for given values of $l_\mathrm{max}$ and $\nu_\mathrm{max}$.
  • Figure 3: Simulation results for water using the CACE $T=1$ model. a Oxygen-oxygen radial distribution functions (RDF) at different temperatures and a density of 1 g/mL, computed via classical MD simulations in the NVT ensemble. The experimental O-O RDF at ambient conditions was obtained from Ref. skinner2014structure. b Mean squared displacement (MSD) from the liquid water simulations. Diffusivities ($D$) are shown in the legends.
  • Figure 4: The dihedral scan benchmark of the 3BPA molecule. a Molecular structure and three dihedral angles. b The dihedral potential energy landscape for $\beta=120^{\circ}$, as predicted by DFT and CACE. The white dots on the DFT potential energy surface correspond to all the training configurations that have $\beta$ between 100$^{\circ}$ and 140$^{\circ}$.
  • Figure 5: Benchmark on the HEA25 dataset. a Learning curves for different models. Both alchemical learning (AL) models are from Ref. lopanitsyna2023modeling. b The first two principal components (PCs) of the CACE embedding matrix. The periods are highlighted with orange, blue, and green lines. Interpolated positions for Re and Os are indicated with empty circles. c Energy comparison between DFT and CACE for a high-temperature (5000 K) and a low-temperature (300 K) trajectory. The inset shows example solid and melt configurations from the 300 K and the 5000 K trajectories, respectively. d Comparison of the substitution energy $E_\mathrm{sub}$, defined as the potential energy difference between the original structure and the structure substituted with Re and Os, per substituted atom, computed using DFT and CACE. The inset shows the parity plot for the force components computed for those structures.
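
The caption of Figure 3b reports diffusivities alongside the MSD curves but does not say how they were extracted. One common route is the Einstein relation for three-dimensional diffusion, $\mathrm{MSD}(t) \approx 6 D t$ at long times, so $D$ is one sixth of the slope of a linear fit. The sketch below assumes that approach; the function name `diffusivity_from_msd` and the `fit_start` fraction are hypothetical, and the data in the usage example are synthetic, not taken from the paper.

```python
import numpy as np

def diffusivity_from_msd(times, msd, fit_start=0.2):
    """Estimate D from MSD(t) ~ 6 D t using a linear fit.

    times, msd: 1-D sequences of equal length.
    fit_start: fraction of the trajectory skipped before fitting, to avoid
               the short-time ballistic regime (an illustrative default).
    """
    times = np.asarray(times, dtype=float)
    msd = np.asarray(msd, dtype=float)
    first = int(fit_start * len(times))
    # np.polyfit with degree 1 returns [slope, intercept].
    slope, _ = np.polyfit(times[first:], msd[first:], 1)
    return slope / 6.0

# Usage with synthetic data: a straight line whose slope is 6 * 0.25.
t = np.linspace(0.0, 10.0, 101)
print(diffusivity_from_msd(t, 6.0 * 0.25 * t + 0.1))
```

The returned value carries the units of the inputs: MSD in Å$^2$ and time in ps give $D$ in Å$^2$/ps.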