Computing solvation free energies of small molecules with experimental accuracy

J. Harry Moore, Daniel J. Cole, Gabor Csanyi

TL;DR

The paper introduces MACE-OFF24-SC, a softcore-equipped, transferable machine-learned potential designed for condensed-phase alchemical free energy calculations. By incorporating $\lambda$-dependent nonbonded scaling and softened two-body interactions, the approach enables stable, rigorous estimation of solvation free energies entirely with ML potentials. Across hydration, octanol solvation, and logP benchmarks, the method achieves sub-chemical accuracy and outperforms several classical forcefields, demonstrating strong transferability to drug-like chemical space. This work suggests ML-based forcefields can deliver ab initio-quality thermodynamics at practical computational cost, with potential to transform drug discovery workflows, while noting current limitations in explicit long-range electrostatics and future avenues for refinement.
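The logP benchmark mentioned above links the two solvation free energies through the standard transfer free energy relation $\log P = (\Delta G_{\mathrm{hyd}} - \Delta G_{\mathrm{oct}}) / (RT \ln 10)$. The sketch below illustrates that arithmetic only. The function name `log_p`, the default temperature, and the numerical inputs are illustrative assumptions, not values or code from the paper, and the relation presumes both free energies are computed for the same solute at the same temperature.

```python
import math

# Gas constant in kcal/(mol K); free energies below are assumed to be in kcal/mol.
R_KCAL = 1.987204259e-3


def log_p(dg_hyd: float, dg_oct: float, temperature: float = 298.15) -> float:
    """Octanol/water partition coefficient from two solvation free energies.

    dg_hyd: hydration free energy (kcal/mol)
    dg_oct: octanol solvation free energy (kcal/mol)
    A more negative dg_oct than dg_hyd means the solute prefers octanol,
    which gives a positive logP.
    """
    return (dg_hyd - dg_oct) / (R_KCAL * temperature * math.log(10))


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the sign convention.
    print(f"logP = {log_p(-4.0, -6.0):.2f}")
```

At 298.15 K the denominator is about 1.36 kcal/mol, so an error of 1 kcal/mol in either solvation free energy shifts the predicted logP by roughly 0.7 log units.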

Abstract

Free energies play a central role in characterising the behaviour of chemical systems and are among the most important quantities that can be calculated by molecular dynamics simulations. Solvation free energies in various organic solvents, in particular, are well-studied physicochemical properties of drug-like molecules and are commonly used to assess and optimise the accuracy of nonbonded parameters in empirical forcefields, and also as a fast-to-compute surrogate of performance for protein-ligand binding free energy estimation. Machine-learned potentials (MLPs) show great promise as more accurate alternatives to empirical forcefields, but are not readily decomposed into physically motivated functional forms, which has thus far rendered them incompatible with standard alchemical free energy methods that manipulate individual pairwise interaction terms. However, since the accuracy of free energy calculations is highly sensitive to the forcefield, this is a key area in which MLPs have the potential to address the shortcomings of empirical forcefields. In this work, we introduce an efficient alchemical free energy protocol that enables calculations of rigorous free energy differences in condensed phase systems modelled entirely by MLPs. Using a pretrained, transferable, alchemically equipped MLP model, we demonstrate sub-chemical accuracy for the solvation free energies of a wide range of organic molecules.

Paper Structure (26 sections, 7 equations, 11 figures, 4 tables)

Figures (11)

  • Figure 1: Construction of softcore dimer curve for the Br–O diatomic pair by matching the gradient at a set switching point to that given by DFT.
  • Figure 2: Dependence of softcore two-body interactions on $\lambda$ for a C–F diatomic pair learned by MACE-OFF24-SC.
  • Figure 3: Comparison of MACE-OFF24-SC hydration free energies with classical forcefields and experiment. Experimental error bars are shown for those compounds where the value exceeds 0.6 kcal/mol. The shaded region represents a 1 kcal/mol deviation.
  • Figure 4: Convergence of ethane hydration free energy with simulation time.
  • Figure 5: Transition probability matrix between 16 replicas from REMD simulation of phenol in water.
  • ...and 6 more figures
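
Figure 5 refers to a transition probability matrix between replicas in a replica-exchange (REMD) simulation. As a rough illustration of how such a matrix can be estimated, the sketch below counts how often a replica moves from state i to state j between successive exchange attempts and normalises each row. The function name `transition_matrix`, the input layout (one row per exchange attempt, one column per replica), and the toy trajectory are assumptions for illustration; the paper's own analysis code is not shown in this section.

```python
import numpy as np


def transition_matrix(state_traj, n_states: int) -> np.ndarray:
    """Empirical state-to-state transition probabilities.

    state_traj: integer array of shape (n_attempts, n_replicas); entry [t, r]
        is the index of the lambda state occupied by replica r at attempt t.
    Returns an (n_states, n_states) row-normalised matrix. Rows for states
    that were never visited are left as zeros.
    """
    traj = np.asarray(state_traj, dtype=int)
    counts = np.zeros((n_states, n_states))
    # Pair each replica's state at attempt t with its state at attempt t + 1.
    np.add.at(counts, (traj[:-1].ravel(), traj[1:].ravel()), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)


if __name__ == "__main__":
    # Toy trajectory: 2 replicas, 3 exchange attempts, 2 states.
    toy = [[0, 1], [1, 0], [1, 0]]
    print(transition_matrix(toy, n_states=2))
```

A matrix with substantial weight off the diagonal indicates that replicas move freely between neighbouring states, whereas a near-identity matrix suggests poor mixing along the alchemical coordinate.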