Ensemble Learning of Machine Learning Force Fields
Bangchen Yin, Yue Yin, Yuda W. Tang, Hai Xiao
TL;DR
This work introduces EL-MLFFs, a stacking-based ensemble framework that fuses diverse pre-trained ML force fields through a graph neural network (GNN) meta-learner to deliver accurate and stable force predictions for molecular and materials simulations. It offers two meta-model variants, a direct-fitting GNN and a conservative (energy-conserving) model, and demonstrates substantial reductions in force errors and improved long-term simulation stability across the methane, methanol/Cu(100), MD17, and MatPES datasets. The approach scales to large, out-of-domain datasets and enables a practical trade-off between efficiency and fidelity: direct ensembles favor speed, while conservative ensembles guarantee conservative forces. Collectively, EL-MLFFs provides a principled framework to mitigate the paradox of choice among MLFFs, enhancing reliability and generalization for both molecular dynamics and materials science applications.
Abstract
Machine learning force fields (MLFFs) are a promising approach to balancing the accuracy of quantum mechanics with the efficiency of classical potentials, yet selecting, from increasingly diverse architectures, a model that delivers reliable force predictions and stable simulations remains a core practical challenge. Here we introduce EL-MLFFs, an ensemble learning framework that uses a stacking methodology to integrate predictions from diverse base MLFFs. Our approach constructs a graph representation on which a graph neural network (GNN) acts as a meta-model to refine the base models' initial force predictions. We present two meta-model architectures: a computationally efficient direct-fitting model and a physically principled conservative model that ensures energy conservation. The framework is evaluated on a diverse range of systems, including single molecules (methane), surface chemistry (methanol/Cu(100)), molecular dynamics benchmarks (MD17), and the MatPES materials dataset. Results show that EL-MLFFs improves predictive accuracy across these domains. For molecular systems, it reduces force errors and improves simulation stability compared to the base models. For materials, the method yields lower formation energy errors on the WBM test set. The EL-MLFFs framework offers a systematic approach to addressing the challenges of model selection and the accuracy-stability trade-off in molecular and materials simulations.
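
The stacking idea summarized above can be illustrated with a deliberately simplified sketch. It is not the EL-MLFFs implementation: the GNN meta-model is replaced here by a least-squares linear combination of base-model forces, the data are synthetic, and every name (`stack_features`, `fit_linear_meta`, `predict`) is hypothetical. It only shows the general pattern of collecting per-atom force predictions from several base models and fitting a meta-model on top of them.

```python
import numpy as np

def stack_features(base_forces):
    """Stack K base-model force predictions, each of shape (N, 3), into (N, 3, K)."""
    return np.stack(base_forces, axis=-1)

def fit_linear_meta(features, ref_forces):
    """Fit one scalar weight per base model by least squares.

    Stand-in for the GNN meta-model (an assumption for illustration only).
    """
    A = features.reshape(-1, features.shape[-1])  # (3N, K) design matrix
    b = ref_forces.reshape(-1)                    # (3N,) reference components
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def predict(features, w):
    """Weighted combination of base forces: (N, 3, K) @ (K,) -> (N, 3)."""
    return features @ w

def rmse(pred, ref):
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

# Synthetic example: reference forces plus three noisy "base models".
rng = np.random.default_rng(0)
ref = rng.normal(size=(50, 3))
base = [ref + rng.normal(scale=s, size=ref.shape) for s in (0.3, 0.4, 0.5)]

X = stack_features(base)
w = fit_linear_meta(X, ref)
ens = predict(X, w)

base_rmse = [rmse(f, ref) for f in base]
ens_rmse = rmse(ens, ref)
```

On the fitting data, the combined prediction cannot have a higher RMSE than any single base model, because each base model is itself one admissible choice of weights. A conservative variant would instead have the meta-model predict a scalar energy and obtain forces as its negative gradient with respect to atomic positions, which requires automatic differentiation and is omitted from this sketch.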
