Fine-Tuning Unifies Foundational Machine-learned Interatomic Potential Architectures at ab initio Accuracy

Jonas Hänseroth, Aaron Flötotto, Muhammad Nawaz Qaisrani, Christian Dreßler

TL;DR

This work demonstrates that fine-tuning transforms foundational machine-learned interatomic potentials (MLIPs) to achieve consistent, near-ab initio accuracy across diverse architectures and introduces the aMACEing Toolkit, which provides a unified and reproducible interface for fine-tuning workflows across multiple MLIP frameworks.

Abstract

This work demonstrates that fine-tuning transforms foundational machine-learned interatomic potentials (MLIPs) to achieve consistent, near-ab initio accuracy across diverse architectures. Benchmarking five leading MLIP frameworks (MACE, GRACE, SevenNet, MatterSim, and ORB) across seven chemically diverse compounds reveals that fine-tuning universally enhances force predictions by factors of 5-15 and improves energy accuracy by 2-4 orders of magnitude. The investigated models span both equivariant and invariant, as well as conservative and non-conservative, architectures. While general-purpose foundation models are robust, they exhibit architecture-dependent deviations from ab initio reference data; fine-tuning eliminates these discrepancies, enabling quantitatively accurate predictions of atomistic and structural properties. Using datasets constructed from equidistantly sampled frames of short ab initio molecular dynamics trajectories, fine-tuning reduces force errors by an order of magnitude and harmonizes performance across all architectures. These findings establish fine-tuning as a universal route to achieving system-specific predictive accuracy while preserving the computational efficiency of MLIPs. To promote widespread adoption, we introduce the aMACEing Toolkit, which provides a unified and reproducible interface for fine-tuning workflows across multiple MLIP frameworks.
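The equidistant frame sampling mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline or the aMACEing Toolkit API: the function name `equidistant_frame_indices` and the choice of a fixed integer stride starting at frame 0 are assumptions made for this example.

```python
import numpy as np


def equidistant_frame_indices(n_frames: int, n_samples: int) -> np.ndarray:
    """Pick n_samples frame indices with a constant stride (hypothetical helper).

    Assumption: sampling starts at frame 0 and uses the largest integer
    stride that still fits n_samples frames into the trajectory.
    """
    if n_samples < 1 or n_samples > n_frames:
        raise ValueError("n_samples must be between 1 and n_frames")
    stride = n_frames // n_samples  # constant spacing between selected frames
    return np.arange(n_samples) * stride


# Example: choose 10 training frames from a 1000-frame AIMD trajectory.
indices = equidistant_frame_indices(n_frames=1000, n_samples=10)
print(indices)
```

A constant stride keeps the selected frames evenly spread over the trajectory, so strongly correlated neighboring frames are not over-represented in the fine-tuning set.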

Paper Structure

This paper contains 16 sections, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Fine-tuning of a pre-trained foundation machine learning interatomic potential.
  • Figure 2: Modules and functions of the aMACEing Toolkit.
  • Figure 3: Root mean squared force errors for foundation and fine-tuned models across all evaluated systems: CsH2PO4 (CDP), Cs7(H4PO4)(H2PO4)8 (CPP), Li13Si4, solvated PhOH, aqueous KOH solution, L-pyroglutamate-ammonium (L-Pyro), MoS2; and frameworks with the respective foundation models: MACE-MP-0, GRACE-1L-OAM, SevenNet-0, MatterSim-Large, ORB-v2. Force errors in meV Å$^{-1}$ and average error reduction in percent.
  • Figure 4: Comparison of physical properties obtained with first-principles methods, foundation models, and fine-tuned foundation models: (a, b) CDP and CPP, ratios of proton diffusion coefficients D(CDP)/D(CPP) (MatterSim); (c) Li13Si4, lithium-ion mean-squared displacements and diffusion coefficients (ORB); (d) phenol in water, O-H bond length distribution of the hydroxyl group (SevenNet); (e) KOH in water, mean-squared displacements of water molecules and hydroxide ions, and the O(hydroxide ion)-O(water) radial distribution function (GRACE); (f) L-pyroglutamate-ammonium, free energy profiles along the proton transfer coordinate (ORB); (g) MoS2, potential energy curves for a sulfur jump into a neighboring line of sulfur vacancies (MACE).
  • Figure 5: Free energy profiles along the proton transfer coordinate of the short hydrogen bond in L-pyroglutamate-NH4, computed using different MLIP frameworks. Results from the foundation model and the fine-tuned foundation model are compared against AIMD reference data.
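
Figure 3 reports root mean squared force errors and the average error reduction in percent. A minimal sketch of how such quantities could be computed is given below. The function names are hypothetical, and averaging over all Cartesian force components is an assumption; the exact averaging convention is not stated here.

```python
import numpy as np


def force_rmse(predicted, reference) -> float:
    """Root mean squared error over all force components.

    Both arrays are assumed to have shape (n_frames, n_atoms, 3) and to use
    the same units (e.g. meV/Å), so the result carries those units too.
    """
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))


def error_reduction_percent(rmse_foundation: float, rmse_finetuned: float) -> float:
    """Relative reduction of the force error after fine-tuning, in percent."""
    return 100.0 * (1.0 - rmse_finetuned / rmse_foundation)


# Synthetic illustration only: these numbers are not results from the paper.
reference = np.zeros((2, 3, 3))
foundation = np.full((2, 3, 3), 3.0)   # every component off by 3
finetuned = np.full((2, 3, 3), 0.3)    # every component off by 0.3

rmse_fm = force_rmse(foundation, reference)
rmse_ft = force_rmse(finetuned, reference)
reduction = error_reduction_percent(rmse_fm, rmse_ft)
```

In this synthetic case a tenfold smaller error corresponds to a reduction of about 90 percent, which is how "an order of magnitude" translates into the percentage scale used in the figure.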