Low-rank matrix and tensor approximations for compression of machine-learning interatomic potentials

Igor Vorotnikov, Fedor Romashov, Nikita Rybin, Maxim Rakhuba, Ivan S. Novikov

TL;DR

The paper addresses the high parameter burden of MLIPs by introducing low-rank matrix and tensor factorizations to compress radial parameter tensors within Moment Tensor Potentials and related MLIPs. It develops two complementary strategies: optimization under fixed-rank constraints (MF, R-MF, TF) and rank augmentation (MFRA), and demonstrates their effectiveness across Mo-Nb-Ta-W, FLiNaK, and glycine systems, achieving up to $50\%$ compression without sacrificing accuracy and often improving it. The work further shows that compressed MTPs compare favorably with equivariant tensor networks and can be extended to ACE, preserving energy/force predictions and density-temperature behavior, while reducing memory and computation. Overall, the approach provides a universal, scalable route to faster, data-efficient MLIP-based simulations with broader applicability to diverse MLIP architectures.

Abstract

Machine-learning interatomic potentials (MLIPs) have become a mainstay in computationally-guided materials science, surpassing traditional force fields due to their flexible functional form and superior accuracy in reproducing physical properties of materials. This flexibility is achieved through mathematically-rigorous basis sets that describe interatomic interactions within a local atomic environment. The number of parameters in these basis sets influences both the size of the training dataset required and the computational speed of the MLIP. Consequently, compressing MLIPs by reducing the number of parameters is a promising route to more efficient simulations. In this work, we use low-rank matrix and tensor factorizations under fixed-rank constraints to achieve this compression. In addition, we demonstrate that an algorithm with automatic rank augmentation helps to find a deeper local minimum of the fitted potential. The methodology is mainly verified using the Moment Tensor Potential (MTP) model and benchmarked on multi-component systems: a Mo-Nb-Ta-W medium-entropy alloy, molten LiF-NaF-KF, and a glycine molecular crystal. The proposed approach achieves up to 50 % compression without any loss of MTP accuracy. We also demonstrate that the developed methodology is universal and can be applied to compress other MLIPs on the example of Atomic Cluster Expansion (ACE).
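To make the parameter-count argument concrete, here is a minimal sketch of how a rank-$r$ factorization $W \approx AB$ shrinks an $m \times n$ parameter matrix from $mn$ to $r(m+n)$ entries. It uses a truncated SVD of a synthetic matrix as a stand-in; this is not the authors' fitting procedure, which optimizes the potential under fixed-rank constraints. The matrix sizes, the rank, and the helper names `low_rank_factors` and `compression_ratio` are assumptions chosen purely for illustration.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Return factors A (m x rank) and B (rank x n) with A @ B ~ W (truncated SVD)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb the singular values into the left factor
    B = Vt[:rank, :]
    return A, B

def compression_ratio(shape, rank):
    """Fraction of parameters saved by storing A and B instead of the full matrix."""
    m, n = shape
    return 1.0 - rank * (m + n) / (m * n)

# Hypothetical sizes: a 16 x 8 parameter matrix that is exactly rank 2.
rng = np.random.default_rng(0)
m, n, true_rank = 16, 8, 2
W = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))

A, B = low_rank_factors(W, true_rank)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
saved = compression_ratio(W.shape, true_rank)

print(f"relative reconstruction error: {rel_err:.2e}")
print(f"fraction of parameters saved:  {saved:.3f}")
```

In this toy setting the factors store 48 numbers instead of 128. For a real potential the achievable rank, and hence the saving, depends on the fitted parameters and cannot be read off from this example.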

Paper Structure

This paper contains 25 sections, 2 theorems, 65 equations, 16 figures, 4 tables, 2 algorithms.

Key Result

Proposition 4.1

Let $\mathcal{M}\subseteq \mathbb{R}^N$ be a Riemannian manifold of the dimensionality $M$. Let $B_x, B_y \in \mathbb{R}^{N \times M}$ be matrices of orthogonal bases in $T_x$ and $T_y$ respectively, and the operator of vector transport $\mathcal{T}_{x \to y}$ be a restriction of $\operatorname{Proj

Figures (16)

  • Figure 1: Illustration of the concepts of a tangent space $T_x \mathcal{M}$ and a retraction $R(x,\xi)$ for a smooth manifold $\mathcal{M}$.
  • Figure 2: Dependence of the loss function, calculated on the Mo-Nb-Ta-W validation set, on the rank of (a) the MF, (b) the R-MF, and (c) the TF MTPs of the 14th level. The loss function for the base MTP is also shown. Error bars show the uncertainty of the loss function and correspond to a 1-$\sigma$ confidence interval.
  • Figure 3: Histogram with the ranks and the number of parameters (linear and radial) for potentials of the 14th and 18th levels fitted on the Mo-Nb-Ta-W training set.
  • Figure 4: Histogram with the ranks and the number of parameters (linear and radial) for potentials of the 14th and 18th levels fitted on the FLiNaK training set.
  • Figure 5: Root mean square errors (RMSEs) for energies and forces calculated with ETN and the compressed MTPs on the Mo-Nb-Ta-W validation set. Results are reported with a 68% confidence interval (i.e., a 1-$\sigma$ interval).
  • ...and 11 more figures

Theorems & Definitions (5)

  • Proposition 4.1
  • proof
  • Proposition 4.2
  • proof
  • Remark