Finding Pathways in Reaction Networks guided by Energy Barriers using Integer Linear Programming
Adittya Pal, Rolf Fagerberg, Jakob Lykke Andersen, Peter Dittrich, Daniel Merkle
TL;DR
This work addresses the challenge of finding kinetically plausible synthesis pathways in large reaction networks by modeling the network as a directed hypergraph and formulating pathway search as an integer linear program over integer hyperflows. A linear, physically motivated objective minimizes ∑_{e∈E} f_e (G_e + RT log D), which is equivalent to maximizing the overall pathway probability; here f_e is the integer flow on hyperedge e, G_e is the energy barrier of reaction e, and D = ∑_{i∈E} exp(−G_i/(RT)). The authors introduce an automated pipeline that estimates energy barriers using OpenBabel, xTB, RDKit, ASE, NeuralNEB, and the Nudged Elastic Band method, enabling kinetic annotation of generatively constructed networks. The method is demonstrated on a glycolonitrile–NH_3–H_2O network expanded to 44 vertices and 116 hyperedges, where it yields multiple high-ranking, structurally distinct pathways to glycine and glycolic acid, illustrating its practicality for large networks.
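To make the objective concrete, the following minimal sketch computes the per-reaction cost G_e + RT log D and checks numerically that the summed cost of a flow equals −RT times the log of the pathway probability, taking each reaction's probability as exp(−G_e/(RT))/D. The barrier values, reaction names, flow, temperature, and function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15          # assumed temperature in K (illustrative)


def edge_costs(barriers, temperature=T):
    """Return the linear cost G_e + RT*log(D) for every reaction e."""
    rt = R * temperature
    # D = sum_i exp(-G_i / (RT)) over all reactions in the network
    d = sum(math.exp(-g / rt) for g in barriers.values())
    log_d = math.log(d)
    return {e: g + rt * log_d for e, g in barriers.items()}


def pathway_probability(flow, barriers, temperature=T):
    """Product over reactions of (exp(-G_e/(RT)) / D) ** f_e."""
    rt = R * temperature
    d = sum(math.exp(-g / rt) for g in barriers.values())
    prob = 1.0
    for e, f in flow.items():
        prob *= (math.exp(-barriers[e] / rt) / d) ** f
    return prob


# Hypothetical barriers in kJ/mol and a hypothetical integer flow.
barriers = {"r1": 60.0, "r2": 85.0, "r3": 72.5}
flow = {"r1": 2, "r2": 0, "r3": 1}

costs = edge_costs(barriers)
objective = sum(flow[e] * costs[e] for e in flow)
probability = pathway_probability(flow, barriers)
```

Because each reaction probability is at most 1, every cost is non-negative, so minimizing the linear objective is the same as maximizing the product of probabilities.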
Abstract
Analyzing synthesis pathways for target molecules in a chemical reaction network annotated with information on the kinetics of individual reactions is an area of active study. This work presents a computational methodology for pathway search in reaction networks that is based on integer linear programming and on modeling reaction networks as directed hypergraphs. Often, multiple pathways fit the given search criteria. To rank them, we develop an objective function, based on physical arguments, that maximizes the probability of the pathway. We furthermore develop an automated pipeline to estimate the energy barriers of individual reactions in reaction networks. Combined, the methodology facilitates flexible and kinetically informed pathway investigations on large reaction networks by computational means, even for networks that come without kinetic annotation, such as those created via generative approaches for expanding molecular spaces.
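The idea of ranking integer hyperflows can be sketched on a toy network. The example below is an illustration under stated assumptions, not the authors' implementation: the three-reaction network, its barriers, the flow bound, and all function names are hypothetical, and exhaustive enumeration stands in for an integer linear programming solver, which is only viable at this tiny size. Each hyperedge has a multiset of reactants and products; a flow is feasible if intermediates are balanced, sources are only consumed, and the target is produced exactly once.

```python
import itertools
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15          # assumed temperature in K (illustrative)

# Hypothetical toy network: edge -> (reactants, products, barrier in kJ/mol)
EDGES = {
    "e1": ({"A": 1, "B": 1}, {"C": 1}, 50.0),
    "e2": ({"C": 1}, {"P": 1}, 55.0),
    "e3": ({"A": 1}, {"P": 1}, 120.0),
}
SOURCES = {"A", "B"}
TARGET, DEMAND = "P", 1


def net_production(flow):
    """Net amount of each molecule produced by the given integer flow."""
    net = {}
    for e, f in flow.items():
        tail, head, _ = EDGES[e]
        for v, n in tail.items():
            net[v] = net.get(v, 0) - n * f
        for v, n in head.items():
            net[v] = net.get(v, 0) + n * f
    return net


def is_feasible(flow):
    """Check the flow conservation constraints at every vertex."""
    for v, n in net_production(flow).items():
        if v == TARGET:
            if n != DEMAND:
                return False
        elif v in SOURCES:
            if n > 0:  # sources may be consumed but not accumulated
                return False
        elif n != 0:   # intermediates must balance exactly
            return False
    return True


def ranked_pathways(max_flow=2):
    """Enumerate bounded integer flows and sort feasible ones by cost."""
    rt = R * T
    log_d = math.log(sum(math.exp(-g / rt) for _, _, g in EDGES.values()))
    cost = {e: g + rt * log_d for e, (_, _, g) in EDGES.items()}
    names = list(EDGES)
    found = []
    for values in itertools.product(range(max_flow + 1), repeat=len(names)):
        flow = dict(zip(names, values))
        if is_feasible(flow):
            total = sum(cost[e] * f for e, f in flow.items())
            found.append((total, flow))
    return sorted(found, key=lambda item: item[0])


pathways = ranked_pathways()
```

In this toy case two pathways are feasible, the two-step route via C and the direct high-barrier route, and the objective ranks the two-step route first. For networks of realistic size, the same constraints and linear objective would be handed to an ILP solver rather than enumerated.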
