Neural Network Surrogates for Free Energy Computation of Complex Chemical Systems
Wasut Pornpatcharapong
TL;DR
Free energy reconstruction based on Gaussian Process Regression (GPR) requires analytical Jacobians of the collective variables (CVs), which is a bottleneck for complex CVs. This work removes that requirement by introducing a neural-network surrogate that learns CVs directly from Cartesian coordinates and obtains their Jacobians by automatic differentiation. The framework is validated on MgCl$_2$ ion pairing with two CVs, a simple distance $d$ and a more complex coordination number $C$. It predicts both CVs and their Jacobians accurately, and the Jacobian errors follow a near-Gaussian distribution, which is favorable for GPR. The findings show that CVs can be learned without explicit analytic forms, so gradient-based free energy methods can incorporate complex or machine-learned CVs and scale to high-dimensional CV spaces, increasing their applicability to biochemistry and materials simulations. The work lays the groundwork for integrating a full GPR pipeline with NN-derived CVs and applying it to large-scale, chemically important systems.
Abstract
Free energy reconstruction methods such as Gaussian Process Regression (GPR) require Jacobians of the collective variables (CVs), a bottleneck that restricts the use of complex or machine-learned CVs. We introduce a neural network surrogate framework that learns CVs directly from Cartesian coordinates and uses automatic differentiation to provide Jacobians, bypassing the need for analytical forms. On an MgCl$_2$ ion-pairing system, our method achieved high accuracy for both a simple distance CV and a complex coordination-number CV. Moreover, the Jacobian errors followed a near-Gaussian distribution, making them suitable for GPR pipelines. This framework enables gradient-based free energy methods to incorporate complex and machine-learned CVs, broadening their scope in biochemistry and materials simulations.
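
To illustrate how automatic differentiation can supply a CV Jacobian without a hand-derived formula, the following is a minimal, dependency-free sketch using forward-mode dual numbers. It is not the implementation used in this work; in practice a framework such as JAX or PyTorch would likely provide the differentiation. The `Dual` class, the helper names, the toy 6-2-1 network, its weights (`W1`, `B1`, `W2`, `B2`), and the coordinates are all hypothetical and chosen only for illustration; the network is untrained and stands in for a learned surrogate.

```python
import math

class Dual:
    """Value plus one directional derivative: the core of forward-mode AD."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = float(val), float(dot)

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def d_sqrt(x):
    x = Dual.lift(x)
    r = math.sqrt(x.val)
    return Dual(r, x.dot / (2.0 * r))

def d_tanh(x):
    x = Dual.lift(x)
    t = math.tanh(x.val)
    return Dual(t, (1.0 - t * t) * x.dot)

def jacobian(cv, coords):
    """dCV/dx_k for every coordinate, seeding one input derivative at a time."""
    row = []
    for k in range(len(coords)):
        seeded = [Dual(c, 1.0 if i == k else 0.0) for i, c in enumerate(coords)]
        row.append(cv(seeded).dot)
    return row

def finite_difference(cv, coords, h=1e-6):
    """Central-difference reference used only to cross-check the AD result."""
    row = []
    for k in range(len(coords)):
        plus, minus = list(coords), list(coords)
        plus[k] += h
        minus[k] -= h
        row.append((cv(plus).val - cv(minus).val) / (2.0 * h))
    return row

def distance_cv(x):
    """Distance between atom 0 (x[0:3]) and atom 1 (x[3:6])."""
    diffs = [x[i] - x[i + 3] for i in range(3)]
    return d_sqrt(sum(d * d for d in diffs))

# Hypothetical fixed weights for a toy 6 -> 2 -> 1 tanh network (not trained).
W1 = [[0.3, -0.2, 0.1, -0.3, 0.2, -0.1],
      [-0.1, 0.4, 0.2, 0.1, -0.4, -0.2]]
B1 = [0.05, -0.05]
W2 = [0.7, -0.5]
B2 = 0.1

def surrogate_cv(x):
    """Toy network mapping Cartesian coordinates to a scalar CV."""
    hidden = [d_tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden)) + B2

# Illustrative positions for one Mg and one Cl atom (arbitrary units).
coords = [0.0, 0.0, 0.0, 1.2, -0.9, 2.0]

d = distance_cv(coords).val
jac_ad = jacobian(distance_cv, coords)

# Analytical Jacobian of the distance: +(r0 - r1)/d on atom 0, the negative on atom 1.
grad_atom0 = [(coords[i] - coords[i + 3]) / d for i in range(3)]
jac_exact = grad_atom0 + [-g for g in grad_atom0]

jac_nn_ad = jacobian(surrogate_cv, coords)
jac_nn_fd = finite_difference(surrogate_cv, coords)

print("distance Jacobian (AD):   ", jac_ad)
print("distance Jacobian (exact):", jac_exact)
print("surrogate Jacobian (AD):  ", jac_nn_ad)
```

For the distance CV the differentiated result can be compared against the known analytical gradient, while for the network no closed form is written down at all, which is the situation the surrogate approach targets; there the sketch only cross-checks against finite differences.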
