${\it Asparagus}$: A Toolkit for Autonomous, User-Guided Construction of Machine-Learned Potential Energy Surfaces

Kai Töpfer, Luis Itza Vazquez-Salazar, Markus Meuwly

TL;DR

Asparagus addresses the fragmentation and reproducibility gaps in ML-based PES construction by delivering a modular, Python-based workflow that unifies data sampling, ab initio interfaces, ML training, evaluation, and downstream application. The platform supports MD, MC, normal-mode-based sampling, metadynamics, and path-based analyses, coupled with PhysNet and PaiNN architectures and ASE/CHARMM interfaces, all governed by a reproducible JSON-config system and a centralized database. Demonstrations across ammonia conformations, organometallic reaction pathways, and surface diffusion illustrate the framework’s ability to generate high-quality ML-PES, validate them against reference data, and deploy them in MD/MD-like simulations. The practical impact lies in lowering entry barriers, improving reproducibility, and enabling efficient, accurate atomistic simulations across diverse chemical space, with open-source licensing and plans for future enhancements such as uncertainty quantification and transfer learning.
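To make the idea of a JSON-governed workflow concrete, the following is a minimal sketch of how settings could be written to and restored from a JSON file so that a run can be repeated from the same inputs. It uses only the Python standard library. The key names, values, and helper functions (`save_config`, `load_config`) are assumptions made for illustration and are not taken from the Asparagus configuration schema or API.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical settings: key names and values are illustrative only
# and do NOT reflect the actual Asparagus configuration schema.
config = {
    "sampling": {"method": "metadynamics", "temperature_K": 300.0},
    "model": {"type": "PhysNet", "cutoff_A": 5.0},
    "training": {"max_epochs": 1000, "seed": 42},
}


def save_config(cfg, path):
    """Write the settings as sorted, indented JSON and return the file path."""
    path = Path(path)
    path.write_text(json.dumps(cfg, indent=2, sort_keys=True))
    return path


def load_config(path):
    """Read the settings back from a JSON file."""
    return json.loads(Path(path).read_text())


# Round trip: a restored configuration should equal the one that was saved.
with tempfile.TemporaryDirectory() as tmp:
    cfg_file = save_config(config, Path(tmp) / "config.json")
    restored = load_config(cfg_file)

round_trip_ok = restored == config
```

Sorting the keys keeps the file stable between runs, which makes differences between two configurations easy to spot with an ordinary text diff.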

Abstract

With the establishment of machine learning (ML) techniques in the scientific community, the construction of ML potential energy surfaces (ML-PES) has become a standard process in physics and chemistry. So far, improvements in the construction of ML-PES models have been made independently of one another, creating an initial hurdle for new users and complicating the reproducibility of results. Aiming to lower the bar for the widespread use of ML-PES, we introduce ${\it Asparagus}$, a software package that brings the different parts together in one coherent implementation and allows an autonomous, user-guided construction of ML-PES models. ${\it Asparagus}$ combines initial data sampling with interfaces to ${\it ab initio}$ calculation programs, ML model training, model evaluation, and application of the model within other codes such as ASE or CHARMM. The functionalities of the code are illustrated in different examples, including the dynamics of small molecules, the representation of reactive potentials in organometallic compounds, and atom diffusion on periodic surface structures. The modular framework of ${\it Asparagus}$ is designed to allow simple implementation of further ML-related methods and models and to provide continuous, user-friendly access to state-of-the-art ML techniques.

Paper Structure

This paper contains 18 sections, 17 equations, and 5 figures.

Figures (5)

  • Figure 1: Schematic representation of the Asparagus classes (panel A) and a workflow chart (panel B) that shows class details and procedures for the construction of an ML-PES in Asparagus.
  • Figure 2: Bond potential for a single N-H bond elongation in ammonia predicted by the PBE reference method (dashed black line) and by PhysNet model potentials trained on reference data from (A) MD simulation, (B) metadynamics, and (C) normal mode scanning sampling methods. The hollow grey bars indicate the N-H bond distance distribution in the respective data sets. Panel D shows the potential curve along the umbrella motion in ammonia predicted by the PBE reference method (dashed black line) and by the PhysNet model potentials trained on reference data from MD simulation (red line), metadynamics (blue dash-dotted line), and normal mode scanning (green dotted line).
  • Figure 3: Panel A: Total energy sequence of an $NVE$ simulation for ammonia in water using the classical force field CGenFF and the QM(ML)/MM approach with a PhysNet or PaiNN model of ammonia trained on metadynamics-sampled data (see Listing \ref{code:meta_nh3}). The energies are arbitrarily shifted for better comparison, and the dashed black lines mark the respective average energies. Panel B: Radial distribution function between ammonia's nitrogen atom and water oxygen atoms in $NpT$ simulations using different potential models.
  • Figure 4: Organometallic reaction. Panel A shows the correlation plot for the predicted energies in the test subset. Panel B displays the minimum energy path obtained from ab initio calculations and with the NN model; the insets show the equilibrium structures. Panel C shows the change in the distance between the alkene carbon atom and the hydrogen atom bonded to the metal, as well as the C-H-Co angle.
  • Figure 5: Panel A: Minimum energy path (MEP) for the diffusion of a gold atom on the Al(100) surface (hollow $\rightarrow$ bridge $\rightarrow$ hollow) computed by NEB simulation using the PBE/PAW level of theory (blue markers) and a PaiNN model potential (red markers) trained on the Au/Al(100) reference data set. The purple line shows the energy prediction of the PaiNN model potential along the MEP structures from NEB with PBE/PAW. Panel B shows the height of the Au atom above the ideal surface layer, together with sketches of the MEP.
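
The comparison in Figure 5, where a model potential is evaluated on the same MEP structures as the reference method, can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the helper names (`barrier`, `max_abs_deviation`) are hypothetical, and the energy values are placeholders chosen only to make the example run. They are not results from the paper.

```python
def barrier(energies):
    """Forward barrier: highest energy along the path minus the first image."""
    return max(energies) - energies[0]


def max_abs_deviation(reference, model):
    """Largest absolute energy difference between two paths, image by image."""
    if len(reference) != len(model):
        raise ValueError("both paths must contain the same number of images")
    return max(abs(r - m) for r, m in zip(reference, model))


# Placeholder energies in eV relative to the first image.
# These numbers are invented for illustration and are NOT from the paper.
reference_path = [0.0, 0.1, 0.3, 0.5, 0.3, 0.1, 0.0]
model_path = [0.0, 0.12, 0.28, 0.48, 0.31, 0.09, 0.0]

reference_barrier = barrier(reference_path)
model_barrier = barrier(model_path)
largest_error = max_abs_deviation(reference_path, model_path)
```

Evaluating both methods on identical structures separates the error of the model energies from differences in the path that each method finds on its own.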