TACE: A unified Irreducible Cartesian Tensor Framework for Atomistic Machine Learning

Zemin Xu, Wenbo Xie, Daiqian Xie, P. Hu

TL;DR

TACE introduces a unified Cartesian-space framework that decomposes atomic environments into irreducible Cartesian tensors, enabling exact symmetry-consistent prediction of scalar and tensorial properties while incorporating external fields, charges, and magnetism. It combines universal invariant and equivariant embeddings with a Latent Ewald Summation module to handle long-range interactions, and validates performance across diverse datasets, including liquids, magnets, charged systems, and external-field scenarios, often matching or surpassing spherical-tensor-based methods. The work demonstrates strong extrapolation, multi-fidelity learning, and robustness, suggesting Cartesian irreducible-tensor approaches can underpin next-generation universal atomistic potentials. Overall, TACE offers a scalable, flexible blueprint for capturing geometry-field-property interplay within a single coherent, extensible framework.

Abstract

Here, we introduce the Tensor Atomic Cluster Expansion (TACE), a unified framework formulated entirely in Cartesian space, enabling systematic and consistent prediction of arbitrary structure-dependent tensorial properties. TACE achieves this by decomposing atomic environments into a complete hierarchy of irreducible Cartesian tensors, ensuring symmetry-consistent representations that naturally encode invariance and equivariance constraints. Beyond geometry, TACE incorporates universal embeddings that flexibly integrate diverse attributes including computational levels, charges, magnetic moments and field perturbations. This allows explicit control over external invariants and equivariants in the prediction process. Long-range interactions are also accurately described through the Latent Ewald Summation module within the short-range approximation, providing a rigorous yet computationally efficient treatment of electrostatic and dispersion effects. We demonstrate that TACE attains accuracy, stability, and efficiency on par with or surpassing leading equivariant frameworks across finite molecules and extended materials. This includes in-domain and out-of-domain benchmarks, spectra, Hessian, external-field responses, charged and magnetic systems, multi-fidelity training, heterogeneous catalysis, and even superior performance within the uMLIP benchmark. Crucially, TACE bridges scalar and tensorial modeling and establishes a Cartesian-space paradigm that unifies and extends beyond the design space of spherical-tensor-based methods. This work lays the foundation for a new generation of universal atomistic machine learning models capable of systematically capturing the rich interplay of geometry, fields and material properties within a single coherent framework.

Paper Structure

This paper contains 29 sections, 20 equations, 4 figures, 11 tables.

Figures (4)

  • Figure 1: Irreducible Cartesian tensor and the TACE architecture. a Taking the rank-2 case as an example, ICTD can extract the corresponding irreducible components with respect to a given weight. Through the basis transformation between the Cartesian and spherical bases, the lower-weight parts of Cartesian tensors can be converted to "$\mathrm{rank}=\mathrm{weight}$". b Illustration of the overall data flow in TACE. The embeddings of invariants and equivariants are generic. In this work we demonstrate embeddings for different computational levels, charges, external fields, and magnetic moments, but in principle other quantities are also supported. We also support the use of ZBL [ZBL] and Latent Ewald Summation [LES1, LES2, LES3, LES4] as plugins.
  • Figure 2: Comparison of Different Mixed Precision Training Approaches. From left to right, the basis sets are STO-3G, 3-21G, 6-31G, def2-SVP, and def2-TZVP. The test set is the same for each column. a-e show TACE using single-fidelity training. f-j show TACE using multi-fidelity training. k-o show TACE using multi-head training. p-t show MACE using multi-head training. u-y show TACE using both multi-fidelity and multi-head schemes.
  • Figure 3: Water simulation outcomes with $L_{\max}=0$. a O-O RDFs at different temperatures from NVT MD simulations and X-ray diffraction. b IR spectra. c Raman spectra.
  • Figure 4: Phonon Spectrum of Diamond. Phonon dispersion relations of diamond predicted by TACE models trained on the GAP17 [GAP17] and GAP20 [GAP20] datasets, compared with DFT results [HotPP].
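
The rank-2 case mentioned in the Figure 1a caption can be illustrated with the textbook decomposition of a 3x3 Cartesian tensor into an isotropic part, an antisymmetric part, and a symmetric traceless part. The sketch below is not the TACE or ICTD implementation. It assumes that "weight" in the caption corresponds to the usual labels 0, 1, and 2 of these three parts, and the function names `decompose_rank2` and `rotation_about_z` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def decompose_rank2(T):
    """Split a 3x3 Cartesian tensor into weight-0, weight-1, and weight-2 parts.

    Assumption: "weight" is taken to be the standard irreducible label,
    with 1, 3, and 5 independent components respectively (1 + 3 + 5 = 9).
    """
    T = np.asarray(T, dtype=float)
    # Weight 0: isotropic part that carries the trace.
    w0 = (np.trace(T) / 3.0) * np.eye(3)
    # Weight 1: antisymmetric part, equivalent to an axial vector.
    w1 = 0.5 * (T - T.T)
    # Weight 2: symmetric part with the trace removed.
    w2 = 0.5 * (T + T.T) - w0
    return w0, w1, w2


def rotation_about_z(angle):
    """Proper rotation matrix about the z axis (illustrative helper)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


rng = np.random.default_rng(0)
T = rng.normal(size=(3, 3))
w0, w1, w2 = decompose_rank2(T)

# The three parts reconstruct the original tensor.
assert np.allclose(w0 + w1 + w2, T)

# Rotating the tensor and then decomposing gives the same result as
# decomposing first and rotating each part: the parts do not mix.
R = rotation_about_z(0.7)
T_rot = R @ T @ R.T
for part, part_rot in zip((w0, w1, w2), decompose_rank2(T_rot)):
    assert np.allclose(R @ part @ R.T, part_rot)
```

The rotation check is the point of the example: because each part transforms only into itself, a model can treat the components of different weight separately while remaining equivariant. Higher-rank tensors need a more general procedure than this closed-form split.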