Neural Network Surrogates for Free Energy Computation of Complex Chemical Systems

Wasut Pornpatcharapong

TL;DR

GPR-based free energy reconstruction requires analytical Jacobians of the collective variables (CVs). This work removes that bottleneck with a neural-network surrogate that learns CVs directly from Cartesian coordinates and obtains their Jacobians by automatic differentiation. The framework is validated on MgCl$_2$ ion pairing with two CVs: a simple distance $d$ and a complex coordination number $C$. It predicts both the CVs and their Jacobians accurately, and the Jacobian errors follow a near-Gaussian distribution, which is favorable for GPR. The findings show that CVs can be learned without explicit analytic forms, so gradient-based free energy methods can incorporate complex or machine-learned CVs and scale to high-dimensional CV spaces, increasing applicability to biochemistry and materials simulations. The work lays the groundwork for integrating a full GPR pipeline with NN-derived CVs and applying it to large-scale, chemically important systems.

Abstract

Free energy reconstruction methods such as Gaussian Process Regression (GPR) require Jacobians of the collective variables (CVs), a bottleneck that restricts the use of complex or machine-learned CVs. We introduce a neural network surrogate framework that learns CVs directly from Cartesian coordinates and uses automatic differentiation to provide Jacobians, bypassing analytical forms. On an MgCl$_2$ ion-pairing system, our method achieved high accuracy for both a simple distance CV and a complex coordination-number CV. The Jacobian errors followed a near-Gaussian distribution, making them suitable for GPR pipelines. This framework enables gradient-based free energy methods to incorporate complex and machine-learned CVs, broadening the scope of biochemistry and materials simulations.
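
The core idea, a network that maps Cartesian coordinates to a CV and is then differentiated automatically to obtain the Jacobian, can be illustrated with a minimal JAX sketch. This is not the paper's implementation: the three-atom toy layout, the network size, the plain gradient-descent training loop, the switching-function form of the coordination number, and the values of `r0` and `n` are all assumptions chosen for illustration.

```python
import jax
import jax.numpy as jnp

# Toy layout (an assumption for illustration): atom 0 plays the role of Mg,
# atoms 1..N-1 are its potential neighbours. Coordinates have shape (n_atoms, 3).

def distance_cv(x):
    """Distance d between atom 0 and atom 1."""
    return jnp.linalg.norm(x[0] - x[1])

def coordination_cv(x, r0=2.5, n=6):
    """Coordination number C of atom 0 from a smooth switching function.

    Uses s(r) = 1 / (1 + (r/r0)^n), which is the m = 2n case of the
    rational form (1 - u^n) / (1 - u^m). r0 and n are hypothetical values.
    """
    r = jnp.linalg.norm(x[1:] - x[0], axis=1)
    return jnp.sum(1.0 / (1.0 + (r / r0) ** n))

def init_mlp(key, sizes):
    """Random weights and zero biases for a fully connected network."""
    params = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        w = jax.random.normal(sub, (d_in, d_out)) * jnp.sqrt(2.0 / d_in)
        params.append((w, jnp.zeros(d_out)))
    return params

def mlp_cv(params, x):
    """Surrogate CV: flattened Cartesian coordinates in, one scalar out."""
    h = x.reshape(-1)
    for w, b in params[:-1]:
        h = jnp.tanh(h @ w + b)
    w, b = params[-1]
    return (h @ w + b)[0]

def loss_fn(params, xs, ys):
    """Mean squared error between surrogate and reference CV values."""
    preds = jax.vmap(lambda x: mlp_cv(params, x))(xs)
    return jnp.mean((preds - ys) ** 2)

@jax.jit
def sgd_step(params, xs, ys, lr=1e-2):
    """One full-batch gradient-descent step; returns the pre-update loss."""
    loss, grads = jax.value_and_grad(loss_fn)(params, xs, ys)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss

# Randomized training set: 256 configurations of 3 atoms, labelled with d.
key = jax.random.PRNGKey(0)
key, data_key = jax.random.split(key)
xs = jax.random.uniform(data_key, (256, 3, 3), minval=-2.0, maxval=2.0)
ys = jax.vmap(distance_cv)(xs)

params = init_mlp(key, [9, 32, 32, 1])
losses = []
for _ in range(200):
    params, loss = sgd_step(params, xs, ys)
    losses.append(float(loss))

# Jacobians by automatic differentiation; each has shape (n_atoms, 3).
x = xs[0]
jac_exact = jax.grad(distance_cv)(x)              # reference Jacobian of d
jac_nn = jax.grad(mlp_cv, argnums=1)(params, x)   # surrogate Jacobian
jac_coord = jax.grad(coordination_cv)(x)          # Jacobian of C
jac_error = jac_nn - jac_exact                    # error one would histogram
```

Differentiating with respect to the input (`argnums=1`) rather than the parameters is what yields the CV Jacobian. Collecting `jac_error` over many configurations gives the kind of error distribution the abstract refers to; a short training run like this one is only meant to show the mechanics, not to reach the reported accuracy.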

Paper Structure

This paper contains 13 sections, 10 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Prediction of $d$ for the randomized dataset (a) and simulation dataset (b) and distribution of prediction errors.
  • Figure 2: Prediction of the Jacobians (all dimensions) of $d$ for the randomized dataset (a) and simulation dataset (b) and distribution of prediction errors.
  • Figure 3: Prediction of $C$ for the randomized dataset (a) and simulation dataset (b) and distribution of prediction errors.
  • Figure 4: Prediction of the Jacobians (all dimensions) of $C$ for the randomized dataset (a) and simulation dataset (b) and distribution of prediction errors.