Grappa -- A Machine Learned Molecular Mechanics Force Field

Leif Seute, Eric Hartmann, Jan Stühmer, Frauke Gräter

TL;DR

Grappa addresses the tension between accuracy and efficiency in molecular mechanics by learning MM parameters directly from molecular graphs with a graph attentional network and a symmetry-preserving transformer. This yields a force field that matches or surpasses state-of-the-art MM performance while preserving the computational efficiency of classical MM, enabling MD in established engines like GROMACS and OpenMM. The approach demonstrates transferability to large biomolecules, extensibility to novel chemistries (including radicals), and compatibility with multiple nonbonded schemes, with robustness verified on proteins and viruses and favorable resource requirements compared to E(3) equivariant models. Overall, Grappa provides a practical, interpretable, and scalable framework for parameterizing MM force fields across broad chemical space at near-MM costs.

Abstract

Simulating large molecular systems over long timescales requires force fields that are both accurate and efficient. In recent years, E(3) equivariant neural networks have lifted the tension between computational efficiency and accuracy of force fields, but they are still several orders of magnitude more expensive than established molecular mechanics (MM) force fields. Here, we propose Grappa, a machine learning framework to predict MM parameters from the molecular graph, employing a graph attentional neural network and a transformer with symmetry-preserving positional encoding. The resulting Grappa force field outperforms tabulated and machine-learned MM force fields in terms of accuracy at the same computational efficiency and can be used in existing molecular dynamics (MD) engines like GROMACS and OpenMM. It predicts energies and forces of small molecules, peptides, RNA and, showcasing its extensibility to uncharted regions of chemical space, radicals at state-of-the-art MM accuracy. We demonstrate Grappa's transferability to macromolecules in MD simulations from a small fast-folding protein up to a whole virus particle. Our force field sets the stage for biomolecular simulations closer to chemical accuracy, but with the same computational cost as established protein force fields.

Paper Structure (38 sections, 27 equations, 12 figures, 7 tables)

This paper contains 38 sections, 27 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: Grappa predicts MM parameters in two steps. First, atom embeddings are predicted from the molecular graph with a graph neural network. Then, transformers with symmetric positional encoding, followed by permutation-invariant pooling, map the embeddings to MM parameters with the desired permutation symmetries. Once the MM parameters are predicted, the potential energy surface can be evaluated with MM efficiency for different spatial conformations.
  • Figure 2: Architecture of the symmetric transformer: Atom embeddings are equipped with a permutation invariant positional encoding determined by the subgraph they represent. They are then passed through $n=3$ permutation equivariant transformer layers, symmetry-pooled and mapped to the possible range of the respective parameter.
  • Figure 3: Grappa predicts one set of parameters per molecule. With the MM energy functional (Eq. \ref{eq:mm_energy}), the parameters can be mapped to energies and forces of given states, whose deviation from the ground truth is minimized during training. State-specific quantities are represented in green, molecule-specific quantities are represented in grey.
  • Figure 4: Comparison of energy predictions of (a) Grappa-1.3 and the established force fields (b) Gaff-2.11, ff99SB-ILDN and RNA.OL3 for test molecules from Espaloma's SPICE-Pubchem, SPICE-Dipeptide and RNA-Trinucleotide datasets; force predictions are depicted in Fig. \ref{fig:gradient_scatter}. (c) The first principal components $u_1$ and $u_2$ of predicted atom embeddings from the Espaloma test dataset can be related to a combination of the main group and period in the periodic table of elements. Lines of constant main group or period are represented by approximate diagonals in latent space.
  • Figure 5: (a) The protein ubiquitin with color-coded sequence position. (b) C-alpha RMSD from the initial state during MD simulation of ubiquitin in water with Grappa. (c) The mean C-alpha root mean square deviation (RMSD) and its 25th and 75th percentile of 1000 random pairs of frames that are separated by the time difference $\Delta t$.
  • ...and 7 more figures
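
The captions of Figures 1 to 3 describe a pipeline in which symmetry-respecting parameters are predicted once per molecule and then reused to evaluate energies for many conformations. The following is a minimal NumPy sketch of that idea under loud assumptions, not Grappa's implementation: random vectors stand in for the graph neural network's atom embeddings, an untrained random MLP with sum pooling over both atom orderings stands in for the symmetric transformer, and only a harmonic bond term is evaluated (a common MM convention assumed here, since the paper's energy equation is not reproduced in this excerpt). All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def tiny_mlp(in_dim, hidden, out_dim):
    """Random, untrained two-layer perceptron (hypothetical stand-in model)."""
    w1 = rng.normal(size=(in_dim, hidden)) / np.sqrt(in_dim)
    w2 = rng.normal(size=(hidden, out_dim)) / np.sqrt(hidden)
    return lambda x: np.tanh(x @ w1) @ w2

def softplus(x):
    """Smooth map from the real line to positive values."""
    return np.logaddexp(0.0, x)

def predict_bond_params(emb, bonds, psi):
    """Map atom embeddings to (k, r0) per bond, invariant under swapping i and j."""
    ei, ej = emb[bonds[:, 0]], emb[bonds[:, 1]]
    # Symmetric pooling: sum the model output over both orderings of the atom pair.
    pooled = psi(np.concatenate([ei, ej], axis=1)) + psi(np.concatenate([ej, ei], axis=1))
    # Map to the allowed range: force constant and equilibrium length must be positive.
    return softplus(pooled[:, 0]), softplus(pooled[:, 1])

def bond_energy(coords, bonds, k, r0):
    """Harmonic bond energy per conformation; coords has shape (n_conf, n_atoms, 3)."""
    d = coords[:, bonds[:, 0]] - coords[:, bonds[:, 1]]  # (n_conf, n_bonds, 3)
    r = np.linalg.norm(d, axis=-1)                       # (n_conf, n_bonds)
    return 0.5 * np.sum(k * (r - r0) ** 2, axis=-1)      # (n_conf,)

n_atoms, dim = 4, 8
emb = rng.normal(size=(n_atoms, dim))        # assumption: placeholder for GNN output
bonds = np.array([[0, 1], [1, 2], [2, 3]])   # a linear four-atom toy molecule
psi = tiny_mlp(2 * dim, 16, 2)

# Parameters are predicted once per molecule ...
k, r0 = predict_bond_params(emb, bonds, psi)
# ... and do not change when each bond is listed in reverse atom order.
k_rev, r0_rev = predict_bond_params(emb, bonds[:, ::-1], psi)

# The same parameters are then reused for every conformation.
coords = rng.normal(size=(5, n_atoms, 3))
energies = bond_energy(coords, bonds, k, r0)
```

Because the expensive prediction step is independent of the coordinates, evaluating `bond_energy` for further conformations costs no more than it would with tabulated parameters, which is the efficiency argument made in the Figure 1 caption.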