MACE-OFF: Transferable Short Range Machine Learning Force Fields for Organic Molecules

Dávid Péter Kovács, J. Harry Moore, Nicholas J. Browning, Ilyes Batatia, Joshua T. Horton, Yixuan Pu, Venkat Kapil, William C. Witt, Ioan-Bogdan Magdău, Daniel J. Cole, Gábor Csányi

TL;DR

MACE-OFF introduces purely short-range, transferable ML force fields for neutral organic systems, trained on high-level quantum data, and demonstrates accurate predictions across gas, liquid, crystal, and biomolecular contexts. The approach combines a two-layer, equivariant graph neural network with a local cutoff and a maximum body-order of $\nu=3$, achieving per-atom energies and inter-molecular forces near chemical accuracy on diverse tasks, including dihedral scans and condensed-phase properties. Across extended SPICE data, dihedral benchmarks, and biomolecular tests (Ala3, Ala15, Crambin), MACE-OFF delivers accurate energies, structures, spectra, and free-energy surfaces while maintaining substantial computational efficiency on GPUs. The results highlight the potential of local ML potentials to enable first-principles-like simulations at scale, with planned enhancements to include explicit long-range electrostatics and charges for charged and protonated species.
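One consequence of pairing a local cutoff with two message-passing layers is that each atom's prediction can depend on atoms up to roughly twice the cutoff away, reached through an intermediate neighbor. The sketch below illustrates this with a plain breadth-first search over a cutoff graph. It is not the MACE architecture: the geometry, the cutoff value of 3.5, and the helper names `neighbor_list` and `receptive_field` are all hypothetical and chosen only for illustration.

```python
import math
from collections import deque


def neighbor_list(positions, cutoff):
    """Map each atom index to the indices of atoms closer than `cutoff`."""
    neighbors = {i: [] for i in range(len(positions))}
    for i, p in enumerate(positions):
        for j in range(i + 1, len(positions)):
            if math.dist(p, positions[j]) < cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def receptive_field(neighbors, center, num_layers):
    """Atoms that can influence `center` after `num_layers` message-passing steps."""
    hops = {center: 0}
    queue = deque([center])
    while queue:
        i = queue.popleft()
        if hops[i] == num_layers:
            continue  # do not expand beyond the last layer
        for j in neighbors[i]:
            if j not in hops:
                hops[j] = hops[i] + 1
                queue.append(j)
    return sorted(hops)


# Hypothetical geometry: 10 atoms on a line, 1.5 length units apart.
positions = [(1.5 * i, 0.0, 0.0) for i in range(10)]
neighbors = neighbor_list(positions, cutoff=3.5)  # illustrative cutoff, not from the paper
field = receptive_field(neighbors, center=0, num_layers=2)
```

In this toy chain, atom 0 has direct neighbors only within 3.5 units, yet two layers let it see atoms out to 6.0 units, which is inside the 2 × cutoff bound of 7.0 units. Anything beyond that bound cannot affect the prediction, which is why explicit long-range electrostatics are listed as a planned enhancement.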

Abstract

Classical empirical force fields have dominated biomolecular simulation for over 50 years. Although widely used in drug discovery, crystal structure prediction, and biomolecular dynamics, they generally lack the accuracy and transferability required for first-principles predictive modeling. In this paper, we introduce MACE-OFF, a series of short range transferable force fields for organic molecules created using state-of-the-art machine learning technology and first-principles reference data computed with a high level of quantum mechanical theory. MACE-OFF demonstrates the remarkable capabilities of short range models by accurately predicting a wide variety of gas and condensed phase properties of molecular systems. It produces accurate, easy-to-converge dihedral torsion scans of unseen molecules, as well as reliable descriptions of molecular crystals and liquids, including quantum nuclear effects. We further demonstrate the capabilities of MACE-OFF by determining free energy surfaces in explicit solvent, as well as the folding dynamics of peptides. Finally, we simulate a fully solvated small protein, observing accurate secondary structure and vibrational spectrum. These developments enable first-principles simulations of molecular systems for the broader chemistry community at high accuracy and relatively low computational cost.

Paper Structure (21 sections, 8 equations, 12 figures, 3 tables)

Figures (12)

  • Figure 1: Test set root mean square errors (RMSE). Errors in the MACE-OFF23 models compared to the underlying DFT reference data, highlighting the relative accuracy of the three models. Bottom panels show specifically inter-molecular force errors compared to overall DFT inter-molecular force magnitudes (RMS). Note that for subsets comprising only single molecule configurations (DES370K Monomers, Dipeptides, QMugs, Tripeptides) inter-molecular contributions are expected to be zero. The slight deviation from zero arises because DFT forces do not obey translational and rotational symmetries with sufficient accuracy, while MACE models do.
  • Figure 2: Dihedral benchmark scans. The top panel shows torsion drive data for the TorsionNet-500 dataset, which has a wide chemical diversity (five example molecules are shown). The bottom panel focuses on the torsion angle between two aromatic rings in the biaryl torsion benchmark [lahey2020biaryl_torsion], which contains 78 molecules (five examples are shown).
  • Figure 3: Infrared (IR) spectrum of a paracetamol molecule. The classical (blue dashed) and quantum (blue solid) IR spectra at 293 K from the MACE-OFF23(M) model are computed by propagating the system and estimating the time correlation function of the time derivative of the total dipole moment. The experimental data are taken from Ref. [nist_webbook].
  • Figure 4: Sublimation enthalpies of molecular crystals. Comparison between predicted sublimation enthalpies of the MACE-OFF23(M) and ANI models and experiment.
  • Figure 5: Molecular liquids. Comparison between MACE-OFF23(M) and ANI-2x with experiment for densities (top) and heats of vaporization (bottom) of condensed phase organic liquids.
  • ...and 7 more figures
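The Figure 3 caption describes obtaining an IR spectrum from the time correlation function of the dipole-moment time derivative. A minimal classical sketch of that idea follows. It assumes a dipole trajectory sampled at a fixed time step and uses a synthetic single-frequency dipole in place of real simulation output. The function names, the Hann taper, and all numerical values are illustrative assumptions rather than the authors' workflow, and the quantum spectrum in the figure would need a different propagation scheme that is not shown here.

```python
import math


def finite_difference(series, dt):
    """Central-difference time derivative of a list of 3-vectors."""
    return [
        tuple((series[i + 1][k] - series[i - 1][k]) / (2.0 * dt) for k in range(3))
        for i in range(1, len(series) - 1)
    ]


def autocorrelation(series, max_lag):
    """Time-averaged <v(0) . v(t)> of a list of 3-vectors for lags 0..max_lag-1."""
    n = len(series)
    acf = []
    for lag in range(max_lag):
        total = 0.0
        for i in range(n - lag):
            a, b = series[i], series[i + lag]
            total += a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
        acf.append(total / (n - lag))
    return acf


def cosine_transform(acf, dt, freqs):
    """Cosine transform of a Hann-tapered, even autocorrelation function."""
    m = len(acf)
    spectrum = []
    for f in freqs:
        s = 0.0
        for k, c in enumerate(acf):
            window = 0.5 * (1.0 + math.cos(math.pi * k / m))  # tapers to zero at max lag
            weight = 1.0 if k == 0 else 2.0  # positive and negative lags contribute equally
            s += weight * window * c * math.cos(2.0 * math.pi * f * k * dt)
        spectrum.append(s * dt)
    return spectrum


# Synthetic stand-in for a dipole trajectory: one oscillation at frequency f0
# (arbitrary units; a real trajectory would come from a molecular dynamics run).
dt, f0 = 0.5, 0.1
dipole = [(math.cos(2.0 * math.pi * f0 * i * dt), 0.0, 0.0) for i in range(400)]

dipole_dot = finite_difference(dipole, dt)
acf = autocorrelation(dipole_dot, max_lag=200)
freqs = [0.02 * i for i in range(1, 21)]
intensity = cosine_transform(acf, dt, freqs)
peak_freq = freqs[intensity.index(max(intensity))]
```

For this synthetic input the spectrum peaks at the grid frequency closest to `f0`. With a real trajectory, the frequency axis would be converted to wavenumbers and the intensity scaled by the appropriate physical prefactors, both of which are omitted here.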