Pretraining Strategy for Neural Potentials
Zehua Zhang, Zijie Li, Amir Barati Farimani
TL;DR
The paper addresses data efficiency in neural potentials for molecular dynamics by introducing a masked pretraining objective for GNNs that recovers spatial information about selectively masked atoms. The masking loss is defined as $\mathcal{L}_{masking} = 1 - \frac{\sum_i \hat{d}_i \cdot d_i}{\sqrt{\sum_i \hat{d}_i^2} \cdot \sqrt{\sum_i d_i^2}}$, a cosine-similarity objective that encourages recovery of relative atomic displacements, and is compared with denoising pretraining, defined by $\mathcal{L}_{denoising} = \|\hat{\xi} - \xi\|^2$. Applied to both energy-centric and force-centric GNNs on the RPBE and TIP3P water datasets, masked pretraining lowers force and energy RMSE and accelerates convergence, with transfer to the Revised MD17 dataset showing strong generalization to small organic molecules. The findings suggest masked pretraining is a practical, model-agnostic improvement for neural potentials, with possible extensions to masking other atom types and scaling to larger GNNs.
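To make the two objectives above concrete, here is a minimal NumPy sketch of both losses. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `(n_masked, 3)` array layout, and the small `eps` guard are hypothetical, and $\xi$ is treated simply as an array of reference values to be regressed.

```python
import numpy as np


def masking_loss(d_pred, d_true, eps=1e-12):
    """1 - cosine similarity between predicted and true displacements.

    d_pred, d_true: arrays of shape (n_masked, 3), one displacement
    vector per masked atom (this layout is an assumption of the sketch).
    eps is a hypothetical guard against division by zero; it is not
    part of the formula in the text.
    """
    d_pred = np.asarray(d_pred, dtype=float)
    d_true = np.asarray(d_true, dtype=float)
    numerator = np.sum(d_pred * d_true)  # sum_i d_hat_i . d_i
    denominator = np.sqrt(np.sum(d_pred**2)) * np.sqrt(np.sum(d_true**2))
    return 1.0 - numerator / (denominator + eps)


def denoising_loss(xi_pred, xi_true):
    """Squared L2 distance between predicted and reference xi."""
    diff = np.asarray(xi_pred, dtype=float) - np.asarray(xi_true, dtype=float)
    return np.sum(diff**2)


d_true = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
print(masking_loss(d_true, d_true))          # close to 0: perfect alignment
print(masking_loss(3.0 * d_true, d_true))    # close to 0: overall scale is ignored
print(masking_loss(-d_true, d_true))         # close to 2: opposite direction
print(denoising_loss(3.0 * d_true, d_true))  # 20.0: magnitude error is penalized
```

The example highlights one difference visible in the formulas themselves: the masking loss depends only on the direction of the stacked displacement vector, while the squared-error denoising loss also penalizes errors in magnitude.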
Abstract
We propose a masked pretraining method for Graph Neural Networks (GNNs) to improve their performance in fitting potential energy surfaces, particularly for water systems. GNNs are pretrained by recovering spatial information related to atoms masked out of molecules, then transferred and finetuned on atomic force fields. Through such pretraining, GNNs learn a meaningful prior about the structural and underlying physical information of molecular systems that is useful for downstream tasks. Through comprehensive experiments and ablation studies, we show that the proposed method improves accuracy and convergence speed compared to GNNs trained from scratch or with other pretraining techniques such as denoising. Moreover, our pretraining method is suitable for both energy-centric and force-centric GNNs. This approach shows potential to enhance the performance and data efficiency of GNNs in fitting molecular force fields.
