Machine Learning Enhanced Calculation of Quantum-Classical Binding Free Energies
Moritz Bensberg, Marco Eckhoff, F. Emil Thomasen, William Bro-Jørgensen, Matthew S. Teynor, Valentina Sora, Thomas Weymuth, Raphael T. Husistein, Frederik E. Knudsen, Anders Krogh, Kresten Lindorff-Larsen, Markus Reiher, Gemma C. Solomon
TL;DR
This work addresses the challenge of accurately predicting binding free energies for protein–drug systems that contain transition metals by combining QM/MM sampling with machine-learned potentials in an automated, distributed workflow. The end-to-end pipeline combines alchemical free energy calculations (MBAR) with non-equilibrium switching (NEQ) corrections and uses active learning to train MM-compatible ML potentials on QM energies and forces. A key advance is the extension of element-embracing atom-centered symmetry functions (eeACSFs) to QM/MM data, which allows systems with many elements to be represented efficiently and QM/MM interfaces to be treated properly. The approach is demonstrated on MCL1–19G and GRP78–NKP1339: it yields a binding free energy in close agreement with experiment for the organic system and robust corrections for the Ru-containing complex. The methodology can be improved systematically through larger QM regions or multilevel QM embedding, while ML potentials and distributed computing keep the computational cost manageable; the resulting $\Delta G_\text{bind}$ predictions could support more reliable target prioritization in drug discovery.
Abstract
Binding free energies are a key element in understanding and predicting the strength of protein–drug interactions. While classical free energy simulations yield good results for many purely organic ligands, drugs including transition metal atoms often require quantum chemical methods for an accurate description. We propose a general and automated workflow that samples the potential energy surface with hybrid quantum mechanics/molecular mechanics (QM/MM) calculations and trains a machine learning (ML) potential on the QM energies and forces to enable efficient alchemical free energy simulations. To represent systems including many different chemical elements efficiently and to account for the different description of QM and MM atoms, we propose an extension of element-embracing atom-centered symmetry functions for QM/MM data as an ML descriptor. The ML potential approach takes electrostatic embedding and long-range electrostatics into account. We demonstrate the applicability of the workflow on the well-studied protein–ligand complex of myeloid cell leukemia 1 and the inhibitor 19G and on the anti-cancer drug NKP1339 acting on the glucose-regulated protein 78.
