On the design space between molecular mechanics and machine learning force fields
Yuanqing Wang, Kenichiro Takaba, Michael S. Chen, Marcus Wieder, Yuzhi Xu, Tong Zhu, John Z. H. Zhang, Arnav Nagle, Kuang Yu, Xinyan Wang, Daniel J. Cole, Joshua A. Rackers, Kyunghyun Cho, Joe G. Greener, Peter Eastman, Stefano Martiniani, Mark E. Tuckerman
TL;DR
This paper maps the design space between traditional molecular mechanics and modern machine-learning force fields, arguing that current ML force fields, while accurate, remain prohibitively slow for large biomolecular systems. It synthesizes the core desiderata (especially invariance, linear scaling, energy conservation, differentiability, universality, and stability) and reviews the MM and MLFF building blocks, including energy decompositions, graph-based representations, and geometry-aware architectures. The authors propose a pathway to the next generation of force fields: fast, universally expressive models with physically informed biases, enabled by differentiable simulation, graph perception, and scalable data ecosystems. Hybrid MM/ML approaches and foundation-model-inspired strategies are highlighted as plausible routes. The authors also emphasize practical considerations such as datasets, training practices, and the integration of ML plugins into MM platforms, and conclude that a fast yet QM-accurate force field would have substantial practical impact on biomolecular modeling and drug discovery.
Abstract
A force field as accurate as quantum mechanics (QM) and as fast as molecular mechanics (MM), with which one can simulate a biomolecular system efficiently enough and meaningfully enough to get quantitative insights, is among the most ardent dreams of biophysicists; it is a dream, nevertheless, not to be fulfilled any time soon. Machine learning force fields (MLFFs) represent a meaningful endeavor in this direction: differentiable neural functions are parametrized to fit ab initio energies and, through automatic differentiation, forces. We argue that, as of now, the utility of MLFF models is no longer bottlenecked by accuracy but primarily by their speed (as well as stability and generalizability): many recent variants, on limited chemical spaces, have long surpassed the chemical accuracy of $1$ kcal/mol (the empirical threshold beyond which realistic chemical predictions are possible), yet they remain orders of magnitude slower than MM. Hoping to kindle explorations and designs of faster, albeit perhaps slightly less accurate, MLFFs, in this review we focus our attention on the design space (the speed-accuracy tradeoff) between MM and ML force fields. After a brief review of the building blocks of force fields of either kind, we discuss the desired properties and challenges now faced by the force field development community, survey the efforts to make MM force fields more accurate and ML force fields faster, and envision what the next generation of MLFFs might look like.
