Pretraining Strategy for Neural Potentials

Zehua Zhang; Zijie Li; Amir Barati Farimani

TL;DR

The paper addresses data efficiency in neural potentials for molecular dynamics by introducing a masked pretraining objective for GNNs: the network recovers spatial information about selectively masked atoms. The masking loss is defined as $\mathcal{L}_{\text{masking}} = 1 - \frac{\sum_i \hat{d}_i \cdot d_i}{\sqrt{\sum_i \hat{d}_i^2} \cdot \sqrt{\sum_i d_i^2}}$, which encourages recovery of relative atomic displacements, and it is compared with denoising pretraining, defined by $\mathcal{L}_{\text{denoising}} = \|\hat{\xi} - \xi\|^2$. Applied to both energy-centric and force-centric GNNs on the RPBE and Tip3p water datasets, masked pretraining improves force and energy RMSE and accelerates convergence, and transfer to the Revised MD17 dataset shows strong generalization to small organic molecules. The findings suggest that masked pretraining is a practical, model-agnostic boost for neural potentials, with possible extensions to masking other atom types and scaling to larger GNNs.
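The two pretraining objectives above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the displacements and noise vectors are supplied as flat lists of components, and the function names and the small `eps` guard against division by zero are hypothetical additions.

```python
import math


def masking_loss(d_pred, d_true, eps=1e-12):
    """One minus the cosine similarity of predicted and target displacements.

    d_pred, d_true: flat sequences of displacement components (assumed layout).
    eps: hypothetical stabiliser, not part of the formula in the text.
    """
    dot = sum(p * t for p, t in zip(d_pred, d_true))
    norm_pred = math.sqrt(sum(p * p for p in d_pred))
    norm_true = math.sqrt(sum(t * t for t in d_true))
    return 1.0 - dot / (norm_pred * norm_true + eps)


def denoising_loss(xi_pred, xi_true):
    """Squared Euclidean distance between predicted and true noise."""
    return sum((p - t) ** 2 for p, t in zip(xi_pred, xi_true))
```

Because the masking loss is a cosine distance, it depends only on the direction of the predicted displacement vector: rescaling the prediction by a positive constant leaves the loss unchanged, whereas the denoising loss penalises magnitude errors directly.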

Abstract

We propose a masked pretraining method for Graph Neural Networks (GNNs) to improve their performance in fitting potential energy surfaces, particularly in water systems. GNNs are pretrained by recovering spatial information related to masked-out atoms in molecules, then transferred and finetuned on atomic force fields. Through such pretraining, GNNs learn a meaningful prior about the structural and underlying physical information of molecular systems that is useful for downstream tasks. Through comprehensive experiments and ablation studies, we show that the proposed method improves accuracy and convergence speed compared to GNNs trained from scratch or pretrained with other techniques such as denoising. Moreover, our pretraining method is suitable for both energy-centric and force-centric GNNs. This approach shows potential to enhance the performance and data efficiency of GNNs in fitting molecular force fields.

Paper Structure (14 sections, 8 equations, 3 figures, 2 tables)

Introduction
Methodology
Graph Neural Networks
Pretraining by masking
Pretraining by Denoising
Experiment
Experiment Set up on Water Datasets
Experiment Results
Comparison over models trained from scratch with additional epochs
Comparison with Denoising
Experiment on Revised MD17 Dataset
Conclusion
Supplementary Material
Data Availability Statements

Figures (3)

Figure 1: Framework of masked pretraining of GNNs on water molecule systems. (a) Water molecules are selected at a given ratio, and the spatial information of one hydrogen atom (blue) in each selected molecule is masked to create a pretext task. (b) GNNs are pretrained to recover the displacements between masked-out atoms and other atoms. Information about all atoms except the masked-out ones is provided to the GNNs to predict these displacements. (c) The pretrained weights are transferred and finetuned to predict the potential energy surface.
Figure 2: Force and energy performance comparison between models trained with regular and extended epochs: 300 epochs and 400 epochs on the RPBE dataset, and 50 epochs and 75 epochs on the Tip3p dataset. $\text{Improvement} = 1 - \frac{\text{RMSE}_{\text{pretrained + regular epochs}}}{\text{RMSE}_{\text{train from scratch + regular/extended epochs}}}$. A negative percentage indicates better performance after pretraining.
Figure 3: Force and energy performance results from models pretrained with masking and denoising tasks on the RPBE and Tip3p datasets.
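
The pretext task described in Figure 1 (a) and (b) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: each water molecule is represented as a hypothetical `(oxygen, hydrogen_1, hydrogen_2)` tuple of atom indices, the number of selected molecules is `round(ratio * n_molecules)`, targets are computed against every unmasked atom rather than a neighbour list, and periodic boundary conditions are ignored.

```python
import random


def select_masked_hydrogens(molecules, ratio, seed=0):
    """Pick a fraction of molecules and mask one hydrogen in each.

    molecules: list of (O, H1, H2) atom-index tuples (assumed layout).
    Returns the sorted indices of the masked hydrogen atoms.
    """
    rng = random.Random(seed)
    n_select = round(ratio * len(molecules))
    chosen = rng.sample(range(len(molecules)), n_select)
    # One hydrogen (index 1 or 2 of the tuple) per selected molecule.
    return sorted(rng.choice(molecules[m][1:]) for m in chosen)


def displacement_targets(positions, masked):
    """Displacement vectors from each masked atom to every unmasked atom."""
    masked_set = set(masked)
    targets = {}
    for i in masked:
        targets[i] = {
            j: tuple(pj - pi for pi, pj in zip(positions[i], positions[j]))
            for j in range(len(positions))
            if j not in masked_set
        }
    return targets
```

In an actual pretraining loop the coordinates of the masked atoms would be withheld from the GNN input, and the displacements returned by `displacement_targets` would serve as regression targets for the masking loss.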
