Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning
Bowen Deng, Yunyeong Choi, Peichen Zhong, Janosh Riebesell, Shashwat Anand, Zhuohan Li, KyuJung Jun, Kristin A. Persson, Gerbrand Ceder
TL;DR
This work identifies a systematic softening of the potential energy surface (PES) in three foundational uMLIPs (M3GNet, CHGNet, MACE-MP-0): energies and forces are consistently underpredicted in high-energy, out-of-distribution environments because the pretraining data are biased toward near-equilibrium structures. The authors trace the softening to limited PES sampling in the Materials Project dataset and show that a minimal fine-tuning protocol, in the simplest case a linear correction derived from a single high-energy data point, substantially mitigates the error across surfaces, defects, solid-solution energetics, phonons, and ion migration barriers. They quantify the effect with a softening scale $s$ and find $s<1$ for most chemistries, which confirms a broad, systematic bias. The findings offer a practical, data-efficient route to correcting uMLIPs and argue for richer PES sampling in future foundational datasets, with implications for reliable, scalable atomistic simulations and materials discovery.
Abstract
Machine learning interatomic potentials (MLIPs) have introduced a new paradigm for atomic simulations. Recent advancements have seen the emergence of universal MLIPs (uMLIPs) that are pre-trained on diverse materials datasets, providing opportunities for both ready-to-use universal force fields and robust foundations for downstream machine learning refinements. However, their performance in extrapolating to out-of-distribution complex atomic environments remains unclear. In this study, we highlight a consistent potential energy surface (PES) softening effect in three uMLIPs: M3GNet, CHGNet, and MACE-MP-0, which is characterized by energy and force under-prediction in a series of atomic-modeling benchmarks including surfaces, defects, solid-solution energetics, phonon vibration modes, ion migration barriers, and general high-energy states. We find that the PES softening behavior originates from a systematic underprediction error of the PES curvature, which derives from the biased sampling of near-equilibrium atomic arrangements in uMLIP pre-training datasets. We demonstrate that the PES softening issue can be effectively rectified by fine-tuning with a single additional data point. Our findings suggest that a considerable fraction of uMLIP errors are highly systematic, and can therefore be efficiently corrected. This result rationalizes the data-efficient fine-tuning performance boost commonly observed with foundational MLIPs. We argue for the importance of a comprehensive materials dataset with improved PES sampling for next-generation foundational MLIPs.
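The softening described above can be illustrated with a short numerical sketch. It assumes, for illustration only, that the softening scale $s$ is estimated as the least-squares slope through the origin of model-predicted force components against reference (DFT) force components, so that $s<1$ indicates underpredicted forces. The function names `softening_scale` and `linear_correction` are hypothetical, the data are synthetic, and the rescaling step is a post hoc stand-in for the linear correction idea, not the fine-tuning procedure used in the paper.

```python
import numpy as np


def softening_scale(f_ref, f_pred):
    """Slope of a through-origin least-squares fit of predicted vs. reference forces.

    Assumption for this sketch: s = <f_ref, f_pred> / <f_ref, f_ref>,
    computed over all flattened force components.
    """
    x = np.asarray(f_ref, dtype=float).ravel()
    y = np.asarray(f_pred, dtype=float).ravel()
    return float(np.dot(x, y) / np.dot(x, x))


def linear_correction(f_pred, scale):
    """Rescale predicted forces by 1/s to undo a uniform softening."""
    return np.asarray(f_pred, dtype=float) / scale


# Synthetic example: "reference" forces for 32 atoms and a model that
# underpredicts them by roughly 20 percent, plus a small random error.
rng = np.random.default_rng(0)
f_dft = rng.normal(scale=2.0, size=(32, 3))
f_model = 0.8 * f_dft + rng.normal(scale=0.05, size=f_dft.shape)

s = softening_scale(f_dft, f_model)      # close to 0.8 for this synthetic data
f_fixed = linear_correction(f_model, s)  # slope against f_dft is now 1

print(f"softening scale before correction: {s:.3f}")
print(f"softening scale after correction:  {softening_scale(f_dft, f_fixed):.3f}")
```

A single scalar removes only the systematic part of the error. The random component in `f_model` survives the rescaling, which is consistent with the abstract's claim that a considerable fraction, rather than all, of the uMLIP error is systematic.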
