BoostMD: Accelerating molecular sampling by leveraging ML force field features from previous time-steps
Lars L. Schaaf, Ilyes Batatia, Christoph Brunken, Thomas D. Barrett, Jules Tilly
TL;DR
BoostMD introduces a surrogate architecture that speeds up ML force-field-guided molecular dynamics by reusing node features from previous time steps to predict energies and forces from positional changes. The method evaluates the expensive reference MLFF only every $N$ steps, while a lightweight, equivariant BoostMD model handles intermediate steps, delivering substantial speedups without sacrificing accuracy. It relies on a reference-framing scheme and equivariant message passing to maintain physical consistency and momentum conservation between boost steps. Empirical results on dipeptide systems show up to an $8\times$ speedup with robust generalization to unseen molecules and accurate sampling of the Boltzmann distribution, suggesting BoostMD as a practical path toward long-timescale, high-accuracy MD simulations.
Abstract
Simulating atomic-scale processes, such as protein dynamics and catalytic reactions, is crucial for advancements in biology, chemistry, and materials science. Machine learning force fields (MLFFs) have emerged as powerful tools that achieve near quantum mechanical accuracy, with promising generalization capabilities. However, their practical use is often limited by long inference times compared to classical force fields, especially when running extensive molecular dynamics (MD) simulations required for many biological applications. In this study, we introduce BoostMD, a surrogate model architecture designed to accelerate MD simulations. BoostMD leverages node features computed at previous time steps to predict energies and forces based on positional changes. This approach reduces the complexity of the learning task, allowing BoostMD to be both smaller and significantly faster than conventional MLFFs. During simulations, the computationally intensive reference MLFF is evaluated only every $N$ steps, while the lightweight BoostMD model handles the intermediate steps at a fraction of the computational cost. Our experiments demonstrate that BoostMD achieves an eight-fold speedup compared to the reference model and generalizes to unseen dipeptides. Furthermore, we find that BoostMD accurately samples the ground-truth Boltzmann distribution when running molecular dynamics. By combining efficient feature reuse with a streamlined architecture, BoostMD offers a robust solution for conducting large-scale, long-timescale molecular simulations, making high-accuracy ML-driven modeling more accessible and practical.
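The simulation schedule described above (reference MLFF every $N$ steps, surrogate in between) can be illustrated with a minimal sketch. This is not the authors' implementation: `reference_model`, `boost_model`, `run_md`, and the harmonic toy potential are hypothetical stand-ins chosen so the example runs on its own. The "features" cached here are simply the reference forces, and the surrogate is a first-order update in the displacement, which happens to be exact for a harmonic potential.

```python
import numpy as np

K = 1.0  # spring constant of a toy harmonic potential (illustrative assumption)


def reference_model(positions):
    """Hypothetical stand-in for the expensive reference MLFF.

    Returns forces plus per-atom "features" to cache for later steps.
    """
    forces = -K * positions
    features = forces.copy()  # toy features: just the reference forces
    return forces, features


def boost_model(ref_positions, ref_features, positions):
    """Hypothetical stand-in for the lightweight surrogate.

    Predicts forces from cached features and the displacement
    accumulated since the last reference evaluation.
    """
    displacement = positions - ref_positions
    return ref_features - K * displacement


def run_md(positions, velocities, n_steps, boost_interval, dt=0.01, mass=1.0):
    """Velocity Verlet with a reference call every `boost_interval` steps."""
    forces, ref_features = reference_model(positions)
    ref_positions = positions.copy()
    ref_calls = 1
    for step in range(1, n_steps + 1):
        # First half-kick and drift.
        velocities = velocities + 0.5 * dt * forces / mass
        positions = positions + dt * velocities
        if step % boost_interval == 0:
            # Expensive step: refresh forces and the cached features.
            forces, ref_features = reference_model(positions)
            ref_positions = positions.copy()
            ref_calls += 1
        else:
            # Cheap step: reuse features from the last reference evaluation.
            forces = boost_model(ref_positions, ref_features, positions)
        # Second half-kick with the new forces.
        velocities = velocities + 0.5 * dt * forces / mass
    return positions, velocities, ref_calls
```

In this toy setting the number of reference calls drops roughly by a factor of `boost_interval`; the real speedup depends on how much cheaper the surrogate is than the reference model, which the sketch does not attempt to model.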
