Hierarchical quantum embedding by machine learning for large molecular assemblies
Moritz Bensberg, Marco Eckhoff, Raphael T. Husistein, Matthew S. Teynor, Valentina Sora, William Bro-Jørgensen, F. Emil Thomasen, Anders Krogh, Kresten Lindorff-Larsen, Gemma C. Solomon, Thomas Weymuth, Markus Reiher
TL;DR
This work develops a two-level hierarchical QM/QM/MM framework in which strategically defined quantum cores within a large QM region are refined using Huzinaga-type projection-based embedding, and the resulting high-accuracy energies are transferred into an ML/MM potential via transfer learning. The approach provides accurate, scalable descriptions of large molecular assemblies and enables binding free energy calculations for protein–ligand systems through alchemical free energy and non-equilibrium switching simulations, with end-state corrections validated against experiment. The study demonstrates that quantum-core refinement induces modest energy shifts but improves the fidelity of the potential energy surface (PES), and that energy-derivative information (forces) substantially enhances the efficiency and accuracy of the transfer-learning step. Overall, hierarchical embedding combined with ML refinement offers a practical route to incorporating high-level electronic structure information into large biomolecular simulations, with clear paths toward automation and broader applicability.
Abstract
We present a quantum-in-quantum embedding strategy coupled to machine learning potentials to improve the accuracy of quantum-classical hybrid models for the description of large molecules. In such hybrid models, relevant structural regions (such as those around reaction centers or pockets for binding of host molecules) can be described by a quantum model that is then embedded into a classical molecular-mechanics environment. However, this quantum region may become so large that only approximate electronic structure models are applicable. To restore accuracy in the quantum description, we introduce the concept of quantum cores within the quantum region, which are amenable to accurate electronic structure models because of their limited size. Huzinaga-type projection-based embedding, for example, can deliver accurate electronic energies for these cores obtained with advanced electronic structure methods. The resulting total electronic energies are then fed into a transfer learning approach that efficiently exploits the higher-accuracy data to improve a machine learning potential obtained for the original quantum-classical hybrid approach. We explore the potential of this approach in the context of a well-studied protein–ligand complex for which we calculate the free energy of binding using alchemical free energy and non-equilibrium switching simulations.
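The transfer-learning idea in the abstract can be illustrated with a deliberately small toy sketch. It is not the authors' model or workflow: the one-dimensional potentials `e_low` and `e_high`, the cubic polynomial features, and the ridge penalty that keeps the fine-tuned weights close to the pretrained ones are all assumptions chosen for illustration. Only energies are used here; force (derivative) information is left out. The sketch pretrains on many cheap low-level energies and then fine-tunes on only a handful of higher-accuracy energies.

```python
import numpy as np

def features(x):
    """Polynomial features [1, x, x^2, x^3] for a 1D coordinate (toy model)."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2, x**3], axis=1)

def e_low(x):
    """Hypothetical low-level reference energy (stand-in for the approximate QM model)."""
    return x**2

def e_high(x):
    """Hypothetical refined energy: low-level energy plus a modest shift."""
    return x**2 + 0.1 * x**3 + 0.05

def pretrain(x, y):
    """Ordinary least-squares fit on abundant low-level data."""
    w, *_ = np.linalg.lstsq(features(x), y, rcond=None)
    return w

def fine_tune(w0, x, y, lam=1e-3):
    """Minimize ||X w - y||^2 + lam * ||w - w0||^2 (closed form).

    The penalty pulls the solution toward the pretrained weights w0,
    so few high-level points suffice to correct the model.
    """
    X = features(x)
    residual = y - X @ w0
    A = X.T @ X + lam * np.eye(X.shape[1])
    return w0 + np.linalg.solve(A, X.T @ residual)

def rmse(w, x, y):
    """Root-mean-square error of the model with weights w on (x, y)."""
    return float(np.sqrt(np.mean((features(x) @ w - y) ** 2)))

# Many low-level points for pretraining, only five high-level points for fine-tuning.
x_low = np.linspace(-1.0, 1.0, 200)
w_pre = pretrain(x_low, e_low(x_low))

x_high = np.linspace(-1.0, 1.0, 5)
w_fine = fine_tune(w_pre, x_high, e_high(x_high))

# Evaluate both models against the high-level reference on a test grid.
x_test = np.linspace(-1.0, 1.0, 101)
rmse_pre = rmse(w_pre, x_test, e_high(x_test))
rmse_fine = rmse(w_fine, x_test, e_high(x_test))
print(f"RMSE vs. high-level reference: pretrained={rmse_pre:.4f}, fine-tuned={rmse_fine:.4f}")
```

In this toy setting the fine-tuned model reproduces the high-level reference far better than the pretrained one, even though it saw only five high-level points. A realistic ML/MM potential would use a neural-network model and, as the TL;DR notes, would benefit from force labels as well.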
