${\it Asparagus}$: A Toolkit for Autonomous, User-Guided Construction of Machine-Learned Potential Energy Surfaces
Kai Töpfer, Luis Itza Vazquez-Salazar, Markus Meuwly
TL;DR
Asparagus addresses the fragmentation and reproducibility gaps in the construction of machine-learned potential energy surfaces (ML-PES) by providing a modular, Python-based workflow that unifies data sampling, ab initio interfaces, ML training, evaluation, and downstream application. The platform supports sampling by molecular dynamics (MD), Monte Carlo (MC), normal modes, and metadynamics, as well as path-based analyses. These are coupled with the PhysNet and PaiNN architectures and with ASE and CHARMM interfaces, all governed by a reproducible JSON-config system and a centralized database. Demonstrations on ammonia conformations, organometallic reaction pathways, and surface diffusion illustrate the framework’s ability to generate ML-PES models, validate them against reference data, and deploy them in MD and related simulations. The practical impact lies in lowering entry barriers, improving reproducibility, and enabling efficient, accurate atomistic simulations across diverse chemical space. The code is released under an open-source license, with future enhancements such as uncertainty quantification and transfer learning planned.
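To make the idea of a JSON-governed workflow concrete, the following is a minimal sketch of how run settings could be stored and reloaded so that a run can be repeated from a single file. It uses only the Python standard library and does not call Asparagus itself. All key names and values (`sampler`, `reference`, `model`, `training`, and their entries) and the helper functions `save_config` and `load_config` are hypothetical and chosen for illustration; they are not taken from the Asparagus documentation.

```python
import json
import os
import tempfile

# Hypothetical settings: key names and values are illustrative only and
# are NOT the actual Asparagus configuration schema.
config = {
    "sampler": {"method": "normal_mode", "temperature_K": 300.0, "n_samples": 1000},
    "reference": {"program": "placeholder_qm_code", "properties": ["energy", "forces"]},
    "model": {"type": "PhysNet"},
    "training": {"max_epochs": 1000, "batch_size": 32},
}


def save_config(cfg, path):
    """Write the settings to a JSON file with a stable key order."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2, sort_keys=True)


def load_config(path):
    """Read the settings back from a JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary file: the reloaded settings should match
# the original dictionary, which is what makes a run repeatable from the file.
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
save_config(config, config_path)
restored = load_config(config_path)
print("round trip identical:", restored == config)
```

Writing the keys in sorted order keeps the file stable between runs, so configuration files can be compared directly or tracked in version control.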
Abstract
With the establishment of machine learning (ML) techniques in the scientific community, the construction of ML potential energy surfaces (ML-PES) has become a standard process in physics and chemistry. So far, improvements in the construction of ML-PES models have been developed independently of one another, which creates an initial hurdle for new users and complicates the reproducibility of results. To lower the barrier to widespread use of ML-PES, we introduce ${\it Asparagus}$, a software package that integrates the different parts into one coherent implementation and allows an autonomous, user-guided construction of ML-PES models. ${\it Asparagus}$ combines initial data sampling with interfaces to ${\it ab initio}$ calculation programs, ML model training, model evaluation, and application of the trained model within other codes such as ASE or CHARMM. The functionalities of the code are illustrated with several examples, including the dynamics of small molecules, the representation of reactive potentials in organometallic compounds, and atom diffusion on periodic surface structures. The modular framework of ${\it Asparagus}$ is designed to allow simple implementation of further ML-related methods and models, providing continued, user-friendly access to state-of-the-art ML techniques.
