Pre-training, fine-tuning, and distillation (PFD): Automatically generating machine learning force fields from universal models

Ruoyu Wang, Yuxiang Gao, Hongyu Wu, Zhicheng Zhong

TL;DR

The paper tackles the challenge of obtaining material-specific, first-principles-accurate force fields without prohibitive data requirements. It introduces PFD, a workflow that starts from a pre-trained universal force field and uses iterative fine-tuning on small DFT datasets, followed by distillation to a fast, local-descriptor force field, facilitated by the PFD-kit. Across bulk and complex materials, PFD achieves substantial reductions in required DFT data (1–2 orders of magnitude) while delivering accuracy comparable to first-principles calculations and enabling large-scale MD simulations that are impractical with traditional training. This approach has broad implications for scalable, high-precision materials modeling, including interfaces, amorphous phases, and high-entropy systems, potentially transforming production-level force-field generation in computational materials science.

Abstract

Universal force fields generalizable across the periodic table represent a new trend in computational materials science. However, the application of universal force fields in material simulations is limited by their slow inference speed and their lack of first-principles accuracy. Instead of building a single model that satisfies both requirements simultaneously, a strategy that quickly generates material-specific models from the universal model may be more feasible. Here, we propose a new workflow pattern, PFD (Pre-training, Fine-tuning, and Distillation), which automatically generates machine-learning force fields for specific materials from a pre-trained universal model through fine-tuning and distillation. By fine-tuning the pre-trained model, our PFD workflow generates force fields with first-principles accuracy while requiring one to two orders of magnitude less training data than traditional methods. The inference speed of the generated force field is further improved through distillation, meeting the requirements of large-scale molecular simulations. Comprehensive testing across diverse materials, including complex systems such as amorphous carbon and interfaces, reveals marked enhancements in training efficiency, suggesting that the PFD workflow is a practical and reliable approach for force field generation in computational materials science.

Paper Structure

This paper contains 8 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: a) Schematic of the PFD concept. A material-specific model with first-principles accuracy is trained by fine-tuning a pre-trained universal model using a small set of first-principles calculations. For improved simulation speed, a simplified model is trained using the dataset generated and labeled by the fine-tuned model. b) Iterative workflow of PFD-kit. The procedure used in the current study consists of two phases. 1. In a fine-tuning iteration, a foundation model, such as DPA-2, is first fine-tuned using a small dataset with DFT energies and forces. The fine-tuned model generates new configurations by running molecular dynamics (MD) simulations on the perturbed structures. Candidate configurations are then sampled from the MD trajectories for DFT labeling and are grouped as an iteration dataset. The fine-tuned model is tested on the iteration dataset for energy and force error. Based on the convergence criteria, the iteration either ends and outputs the fine-tuned model, or repeats after the iteration dataset is added to the fine-tuning dataset. 2. In the distillation phase, the fine-tuned model is used to generate and label a large quantity of synthetic data, which is then used to train a randomly initialized “student” model of simpler design, such as the DeePMD model with a local descriptor.
  • Figure 2: Data efficiency of fine-tuning. The figure illustrates the fine-tuning performance of a pre-trained DPA-2 model using subsets of the argyrodite Li$_6$PS$_5$X (X = Cl, Br, I) solid electrolyte as well as its decomposition subsystems Li$_2$S, LiX, P$_2$S$_5$ and LiPS$_3$. It highlights the convergence of energy and atomic force prediction with increasing dataset size. The crystal structure of Li$_6$PS$_5$X is also presented.
  • Figure 3: Crystalline Si force fields generated from the PFD workflow. a) Energy and b) force accuracy of the fine-tuned (denoted as PF) model for various Si phases. c) Energy-volume curves of Si polymorphs predicted by the PF and distilled (denoted as PFD) models. d) Predicted vacancy formation energy $\Delta E_{\mathrm{vac}}$ of various Si phases. Note that only the vacancy states of diamond Si are included in the training data. e) Phonon dispersion curves of the ground-state diamond Si predicted by the PF and PFD models, as well as by the DFT calculation. f) Computational efficiency of the PF and PFD models. The average CPU walltime per atom$\cdot$step is measured on a computation node with a single NVIDIA V100 32GB card.
  • Figure 4: Ion transport in Li$_{1+x}$Al$_x$Ti$_{2-x}$(PO$_4$)$_3$ solid electrolyte. a) Energy and b) force prediction error of the distilled model for Li$_{1+x}$Al$_x$Ti$_{2-x}$(PO$_4$)$_3$ solid electrolyte. c) The temperature-dependent diffusion coefficients $D$ of Li$_{1.3}$Al$_{0.3}$Ti$_{1.7}$(PO$_4$)$_3$ calculated using the distilled model for simulation cells of 110 and 3520 atoms, respectively. For comparison, high-temperature $D$ values calculated using AIMD simulations in a small cell from a previous study [he_origin_2017] are also listed here. The interpolated activation energy barrier is 0.18 eV.
  • Figure 5: Fine-tuned model for polyisoprene chain. a) A cluster of several polyisoprene chains. The inset image is a single polyisoprene chain, which consists of several 1,4-polyisoprene building blocks. b) Energy and c) force prediction accuracy of the fine-tuned model generated using the PFD workflow. The inset in b) indicates an overall energy shift for the cluster system, possibly due to inter-chain interactions that are not explicitly included in the training dataset.
  • ...and 3 more figures
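
The two phases described in the Figure 1 caption can be read as a simple control loop followed by a teacher-student step. The Python sketch below is a toy illustration of that control flow only, not the PFD-kit API: every function name, the one-parameter "model" $E(x) = k x^2$, the fixed exploration grid, the learning rate, and the tolerance are assumptions made for the example. In the real workflow the callables would stand for fine-tuning a model such as DPA-2, MD exploration, DFT labeling, and training a simpler student model.

```python
import math

def fine_tune_loop(model, dataset, fine_tune, explore, label, evaluate,
                   tol, max_iters=10):
    """Repeat fine-tune -> explore -> label -> test until the test error < tol."""
    for iteration in range(1, max_iters + 1):
        model = fine_tune(model, dataset)           # start from the current model
        candidates = explore(model)                 # stands in for MD sampling
        iteration_set = [(x, label(x)) for x in candidates]  # reference labels
        error = evaluate(model, iteration_set)      # test on the iteration dataset
        if error < tol:
            return model, dataset, iteration        # converged: output the model
        dataset = dataset + iteration_set           # else grow the fine-tune set
    return model, dataset, max_iters

def distill(teacher_predict, sample_inputs, train_student):
    """Label synthetic inputs with the teacher, then fit a fresh student."""
    synthetic = [(x, teacher_predict(x)) for x in sample_inputs]
    return train_student(synthetic)

# ---- Toy stand-ins (hypothetical): energy model E(x) = k * x**2 ----
def reference_energy(x):
    """Plays the role of the expensive first-principles calculation."""
    return 1.0 * x * x

def gradient_fine_tune(k, data, lr=0.05, steps=50):
    """A few gradient-descent steps on the mean squared error, starting from k."""
    for _ in range(steps):
        grad = 2.0 * sum((k * x * x - e) * x * x for x, e in data) / len(data)
        k -= lr * grad
    return k

def rmse(k, data):
    """Root-mean-square energy error of the model k on labeled data."""
    return math.sqrt(sum((k * x * x - e) ** 2 for x, e in data) / len(data))

def fit_k(data):
    """Closed-form least-squares fit of k, used to train the student from scratch."""
    return sum(e * x * x for x, e in data) / sum(x ** 4 for x, _ in data)

pretrained_k = 0.8   # "universal" starting point: close to, but not at, the target
initial_set = [(x, reference_energy(x)) for x in (0.5, 1.0)]

k, final_set, n_iter = fine_tune_loop(
    model=pretrained_k,
    dataset=initial_set,
    fine_tune=gradient_fine_tune,
    explore=lambda k: [-2.0, -1.0, 1.0, 1.5, 2.0],   # fixed grid instead of MD
    label=reference_energy,
    evaluate=rmse,
    tol=1e-3,
)

student_k = distill(
    teacher_predict=lambda x: k * x * x,             # fine-tuned model as teacher
    sample_inputs=[i / 10 for i in range(-20, 21)],  # cheap synthetic inputs
    train_student=fit_k,
)
print(f"iterations={n_iter}, fine-tuned k={k:.6f}, student k={student_k:.6f}")
```

In this toy the student has the same functional form as the teacher, so distillation is trivially exact. The point of the sketch is the data flow: reference labels are only requested inside the fine-tuning loop, while the distillation step uses labels produced by the fine-tuned model alone.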