Lifelong Machine Learning Potentials for Chemical Reaction Network Explorations
Marco Eckhoff, Markus Reiher
TL;DR
This work addresses the high computational cost of exploring chemical reaction networks (CRNs) by introducing lifelong machine learning potentials (lMLPs) that continually adapt to new data via lifelong adaptive data selection (lADS). Using universal eeACSF descriptors, ensemble-based uncertainty estimates, and a Δ-learning approach with PBE-GFN2, the authors show that lMLPs can reach chemical accuracy in rolling CRN explorations while reducing retraining costs. Compared with conventional iterative learning, continual learning with lADS preserves prior knowledge and integrates new data efficiently, which improves energy and force prediction accuracy and enables on-the-fly CRN exploration. The results suggest a practical route to reliable, uncertainty-aware CRN predictions and adaptive refinement with a limited number of targeted high-level calculations.
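The interplay of Δ-learning and ensemble uncertainty mentioned above can be illustrated with a minimal sketch. It is not the authors' lADS algorithm or their lMLP code. It assumes that the ensemble learns a correction on top of a cheaper baseline energy, that the uncertainty is the sample standard deviation across ensemble members, and that structures are flagged for a new reference calculation when this spread exceeds a threshold. All names (`delta_energy`, `select_for_labeling`, `make_member`, `baseline`) and the toy energy functions are hypothetical.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins: a "structure" is just a list of numbers here.
Structure = Sequence[float]
Model = Callable[[Structure], float]


def delta_energy(structure: Structure, baseline: Model,
                 ensemble: Sequence[Model]) -> Tuple[float, float]:
    """Return (energy, uncertainty) for one structure.

    Energy = baseline energy + mean learned correction (Δ-learning).
    Uncertainty = sample standard deviation of the ensemble corrections.
    """
    corrections = [member(structure) for member in ensemble]
    energy = baseline(structure) + mean(corrections)
    uncertainty = stdev(corrections)
    return energy, uncertainty


def select_for_labeling(structures: Sequence[Structure], baseline: Model,
                        ensemble: Sequence[Model],
                        threshold: float) -> List[int]:
    """Indices of structures whose ensemble spread exceeds the threshold."""
    return [
        index
        for index, structure in enumerate(structures)
        if delta_energy(structure, baseline, ensemble)[1] > threshold
    ]


def baseline(structure: Structure) -> float:
    """Toy baseline energy (assumption, stands in for a cheap method)."""
    return sum(x * x for x in structure)


def make_member(k: float) -> Model:
    """Toy ensemble member; members disagree more far from the origin."""
    return lambda s: 0.1 * sum(s) + k * sum(x ** 3 for x in s)


ensemble = [make_member(k) for k in (-0.01, 0.0, 0.01)]
structures = [[0.1, 0.2], [2.0, 3.0]]

# Only the second structure lies where the toy members disagree strongly.
selected = select_for_labeling(structures, baseline, ensemble, threshold=0.05)
print(selected)
```

In this toy setup the ensemble members agree near the origin and diverge farther away, so only the second structure would be passed on for a reference calculation. A continual-learning scheme would then integrate such newly labeled data without retraining from scratch.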
Abstract
Recent developments in computational chemistry facilitate the automated quantum chemical exploration of chemical reaction networks for the in-silico prediction of synthesis pathways, yield, and selectivity. However, the underlying quantum chemical energy calculations require vast computational resources, which severely limits these explorations in practice. Machine learning potentials (MLPs) offer a way to increase computational efficiency while retaining the accuracy of the reliable first-principles data used for their training. Unfortunately, MLPs are limited in their ability to generalize within chemical (reaction) space if the underlying training data are not representative of a given application. Within the framework of automated reaction network exploration, where new reactants or reagents composed of any elements from the periodic table can be introduced, this lack of generalizability will be the rule rather than the exception. Here, we therefore evaluate the benefits of the lifelong MLP concept in this context. Lifelong MLPs extend their adaptability through efficient continual learning of additional data. We propose an improved learning algorithm for lifelong adaptive data selection that integrates new data efficiently while preserving previous expertise. In this way, we can reach chemical accuracy in reaction search trials.
