Machine-Learned Potentials for Solvation Modeling

Roopshree Banchode, Surajit Das, Shampa Raghunathan, Raghunathan Ramakrishnan

TL;DR

This review surveys machine-learned potentials (MLPs) for solvation modeling, highlighting how ML approaches can achieve near first-principles accuracy at MD-scale efficiency. It connects PES fitting, many-body expansions, and symmetry considerations to practical architectures, from descriptor-based NN-MLPs to gradient-domain and force-only models, including both end-to-end and graph-based (MP-GNN) variants. The authors categorize models, discuss training strategies (including Δ-ML and active learning), and illustrate the breadth of applications through case studies spanning explicit, implicit, and hybrid solvation, plus direct solvation-property predictions. They also address open challenges—transferability, long-range electrostatics, and sampling—and outline directions toward robust, transferable solvation-aware ML frameworks that can impact catalysis, interfacial chemistry, and biomolecular solvation. Overall, the work charts a path for integrating physically grounded ML potentials into solvated systems, enabling accurate, scalable simulations and accelerated discovery in chemistry and materials science.

Abstract

Solvent environments play a central role in determining molecular structure, energetics, reactivity, and interfacial phenomena. However, modeling solvation from first principles remains difficult due to the complex interplay of interactions and unfavorable computational scaling of first-principles treatment with system size. Machine-learned potentials (MLPs) have recently emerged as efficient surrogates for quantum chemistry methods, offering first-principles accuracy at greatly reduced computational cost. MLPs approximate the underlying potential energy surface, enabling efficient computation of energies and forces in solvated systems, and are capable of accounting for effects such as hydrogen bonding, long-range polarization, and conformational changes. This review surveys the development and application of MLPs in solvation modeling. We summarize the theoretical basis of MLP-based energy and force predictions and present a classification of MLPs based on training targets, model types, and design choices related to architectures, descriptors, and training protocols. Integration into established solvation workflows is discussed, with case studies spanning small molecules, interfaces, and reactive systems. We conclude by outlining open challenges and future directions toward transferable, robust, and physically grounded MLPs for solvation-aware atomistic modeling.
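The abstract's statement that an MLP approximates the potential energy surface and thereby yields both energies and forces can be illustrated with a minimal sketch. It is not the review's method: a Lennard-Jones pair energy stands in for a trained model, and the names `toy_energy` and `numerical_forces`, the parameters `epsilon`, `sigma`, and `h`, and the three-atom geometry are all hypothetical choices made for illustration. The only point it demonstrates is that forces follow from the energy as $\mathbf{F}_i = -\partial E / \partial \mathbf{r}_i$.

```python
import numpy as np

def toy_energy(positions, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy; an assumed stand-in for a trained MLP's E(r)."""
    n = len(positions)
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 ** 2 - sr6)
    return energy

def numerical_forces(energy_fn, positions, h=1e-5):
    """Central-difference estimate of F_i = -dE/dr_i for every atom and axis."""
    forces = np.zeros_like(positions)
    for i in range(positions.shape[0]):
        for k in range(positions.shape[1]):
            plus = positions.copy()
            minus = positions.copy()
            plus[i, k] += h
            minus[i, k] -= h
            forces[i, k] = -(energy_fn(plus) - energy_fn(minus)) / (2.0 * h)
    return forces

# Hypothetical three-atom geometry in arbitrary reduced units (all atoms in the z = 0 plane).
positions = np.array([[0.0, 0.0, 0.0],
                      [1.2, 0.0, 0.0],
                      [0.0, 1.3, 0.0]])
forces = numerical_forces(toy_energy, positions)
```

Because the toy energy depends only on interatomic distances, the forces on all atoms sum to approximately zero, which is a quick sanity check on any energy-derived force model. In practice a trained model would typically obtain the same gradient by automatic differentiation rather than finite differences.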

Paper Structure

This paper contains 50 sections, 158 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: Distribution of review articles cited in the present work, categorized by thematic focus (labeled using BibTeX-style citation keys, lastnameYYYYfirstword): traditional solvation modeling based on ab initio and force-field methods (blue), general ML strategies for molecular and materials modeling (green), and ML reviews with explicit discussions on solvation applications (orange). As of yet, there is no focused review solely on "MLPs for solvation modeling". Three reviews (wan2024construction, yang2024machine, keith2021combining) that include dedicated sections on solvation modeling are highlighted in orange. This classification highlights the emerging emphasis on MLPs in the context of solvation modeling.
  • Figure 2: Four levels of solvation modeling are illustrated using caffeine as the example solute: (1) Explicit solvation, where the solute is fully immersed in bulk water, typically modeled with periodic boundary conditions; here, a 10 Å cubic water box was used for illustration; (2) Implicit solvation, where the solute resides in a molecule-shaped cavity within a polarizable dielectric continuum; (3) Microsolvation, a hybrid approach in which the solute is surrounded by a small number of explicit solvent molecules, with vacuum beyond; and (4) Cluster–continuum, another hybrid approach where the solute–solvent cluster is embedded within a continuum model, capturing both local explicit interactions and long-range dielectric effects.
  • Figure 3: Concept map of MLP-driven solvation modeling, as presented in this review. Solvation modeling is organized around computational and solvation paradigms and interfaces with machine learning to produce rapid, accurate, and transferable data-driven force fields.
  • Figure 4: Illustration of a water molecule showing atomic position vectors, $\lbrace {\bf r}_i \rbrace$, and the Cartesian components of the force vector, ${\bf F}_3$, acting on a selected atom. All vectors are shown in a fixed coordinate system.
  • Figure 5: Illustration of $\Delta$-ML for predicting NMR chemical shifts in a prototypical molecule, adapted from Ref. gupta2021revving. The model is trained to learn the difference between a baseline prediction (DFT with a small basis set using semiempirical geometries) and a higher-level reference (DFT with a large basis set using geometries optimized at a comparable level).
  • ...and 7 more figures