Are Foundational Atomistic Models Reliable for Finite-Temperature Molecular Dynamics?

Denan Li, Jiyuan Yang, Xiangkai Chen, Lintao Yu, Shi Liu

TL;DR

This Perspective critically evaluates foundational atomistic models (universal ML force fields) for finite-temperature MD, using PbTiO$_3$ as a focused test case to ask whether static accuracy translates into reliable dynamic performance. It combines ground-state (static) assessments with finite-temperature MD and reveals a potential disconnect: several models reproduce the ground-state structure and phonons yet fail to capture the temperature-driven phase transition or become unstable during MD. A key insight is that training data quality and the choice of exchange–correlation functional used to generate the training data strongly influence dynamic reliability. Simple fine-tuning can substantially improve agreement with known physics, albeit at additional data and computational cost. The work highlights the practical challenges of adopting foundational atomistic models (data diversity, scalability, and software integration) and proposes hybrid training strategies and targeted benchmarking as pragmatic paths toward robust, scalable MD for materials discovery.

Abstract

Machine learning force fields have emerged as promising tools for molecular dynamics (MD) simulations, potentially offering quantum-mechanical accuracy with the efficiency of classical MD. Inspired by foundational large language models, researchers have in recent years made considerable progress in developing foundational atomistic models, sometimes referred to as universal force fields, designed to cover most elements in the periodic table. This Perspective adopts a practitioner's viewpoint to ask a critical question: Are these foundational atomistic models reliable for one of their most compelling applications, namely simulating finite-temperature dynamics? Instead of a broad benchmark, we use the canonical ferroelectric-paraelectric phase transition in PbTiO$_3$ as a focused case study to evaluate prominent foundational atomistic models. Our findings suggest a potential disconnect between static accuracy and dynamic reliability. While 0 K properties are often well reproduced, we observed that the models can struggle to consistently capture the correct phase transition, sometimes exhibiting simulation instabilities. We believe these challenges may stem from inherent biases in training data and a limited description of anharmonicity. These observed shortcomings, though demonstrated on a single system, appear to point to broader, systemic challenges that can be addressed with targeted fine-tuning. This Perspective serves not to rank models, but to initiate a crucial discussion on the practical readiness of foundational atomistic models and to explore future directions for their improvement.

Paper Structure

This paper contains 15 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Lattice parameter $a$ and tetragonality ($c/a$) of ground-state PbTiO$_3$ predicted by various MLFFs and exchange-correlation functionals.
  • Figure 2: Phonon spectra of PbTiO$_3$ calculated using various MLFFs, each based on the optimized ground-state tetragonal structure. The panels show results for: (a) PBE, (b) CHGNet, (c) GPTFF, (d) M3GNet, (e) MACE, (f) ORB, (g) SevenNet, (h) PBEsol, (i) UniPero, and (j) MACE-FT. The spectra obtained from (a) PBE and (h) PBEsol are also included for comparison. (i) UniPero and (j) MACE-FT are trained on a PBEsol-derived database.
  • Figure 3: Temperature-dependent lattice constants ($a$ and $c$) obtained from $NPT$ MD simulations using various machine learning force fields. The panels show results for: (a) CHGNet, (b) M3GNet, (c) MACE, (d) ORB, (e) SevenNet, (f) GPTFF, (g) MACE-FT, and (h) UniPero. The dashed lines indicate the ground-state lattice parameters ($a_0$ and $c_0$) of tetragonal PbTiO$_3$ for each force field. The error bars represent the standard deviation over the 50-ps production trajectory, reflecting the extent of thermal fluctuations.
  • Figure 4: Temperature-dependent spontaneous polarization along the $c$-axis ($P_z$) obtained from $NVT$ MD simulations, with lattice parameters fixed to the experimental room-temperature values ($a = 3.90$ Å, $c = 4.15$ Å). At high temperatures, the polarization does not fully converge to zero due to the imposed tetragonality constraint ($c/a = 1.06$). The error bars represent the extent of polarization fluctuations arising from thermal effects over the 50-ps production trajectory.
  • Figure 5: Computational efficiency benchmark. The reported speed data are for reference only, as a model's performance can be further improved through careful tuning and the implementation of multi-GPU parallelism.
  • ...and 1 more figure
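
The captions of Figures 3 and 4 describe error bars as the spread of a quantity over the production trajectory, and Figure 4 quotes a tetragonality of $c/a = 1.06$ for the fixed experimental cell. A minimal sketch of that post-processing is given below, assuming the lattice constants have already been extracted as per-frame arrays. The function name `summarize_lattice`, the synthetic time series, and the use of the sample standard deviation (`ddof=1`) are illustrative assumptions, not the authors' workflow.

```python
import numpy as np


def summarize_lattice(a_series, c_series):
    """Return the time-averaged lattice constants, their standard deviations,
    and the tetragonality c/a computed from the averaged values."""
    a = np.asarray(a_series, dtype=float)
    c = np.asarray(c_series, dtype=float)
    return {
        "a_mean": a.mean(),
        "a_std": a.std(ddof=1),  # sample standard deviation (assumed convention)
        "c_mean": c.mean(),
        "c_std": c.std(ddof=1),
        "c_over_a": c.mean() / a.mean(),
    }


# Synthetic stand-in for an MD trajectory (hypothetical data, not simulation output):
# fluctuations around the experimental room-temperature values quoted for Figure 4.
rng = np.random.default_rng(0)
a_series = 3.90 + 0.01 * rng.standard_normal(5000)
c_series = 4.15 + 0.02 * rng.standard_normal(5000)

stats = summarize_lattice(a_series, c_series)
print(f"a = {stats['a_mean']:.3f} +/- {stats['a_std']:.3f} Angstrom")
print(f"c = {stats['c_mean']:.3f} +/- {stats['c_std']:.3f} Angstrom")
print(f"c/a = {stats['c_over_a']:.2f}")
```

With real data, the two arrays would come from the $NPT$ cell dimensions logged at each frame. Correlated frames make the standard deviation a measure of thermal fluctuation rather than of statistical uncertainty in the mean, which matches how the captions describe the error bars.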