React-OT: Optimal Transport for Generating Transition State in Chemical Reactions

Chenru Duan, Guan-Horng Liu, Yuanqi Du, Tianrong Chen, Qiyuan Zhao, Haojun Jia, Carla P. Gomes, Evangelos A. Theodorou, Heather J. Kulik

TL;DR

React-OT combines high accuracy with rapid inference. Integrated with the current high-throughput TS search workflow, it will facilitate the exploration of chemical reactions with unknown mechanisms.

Abstract

Transition states (TSs) are transient structures that are key to understanding reaction mechanisms and designing catalysts but are challenging to capture in experiments. As an alternative, many optimization algorithms have been developed to search for TSs computationally. Yet the cost of these algorithms, which are driven by quantum chemistry methods (usually density functional theory), is still high, posing challenges for their application in building large reaction networks for reaction exploration. Here we developed React-OT, an optimal transport approach for generating unique TS structures from reactants and products. React-OT generates highly accurate TS structures with a median structural root mean square deviation (RMSD) of 0.053 Å and a median barrier height error of 1.06 kcal/mol, requiring only 0.4 seconds per reaction. The RMSD and barrier height error are further improved by roughly 25% through pretraining React-OT on a large reaction dataset obtained with a lower level of theory, GFN2-xTB. We envision that the remarkable accuracy and rapid inference of React-OT will be highly useful when integrated with the current high-throughput TS search workflow. This integration will facilitate the exploration of chemical reactions with unknown mechanisms.
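
The structural RMSD quoted above compares a generated TS geometry with a reference geometry. The abstract does not state the alignment protocol, so the following is only a minimal sketch under stated assumptions: both structures list the same atoms in the same order, and the RMSD is taken after an optimal rigid alignment (the standard Kabsch algorithm). The function name `kabsch_rmsd` and the toy coordinates are illustrative and are not taken from the paper.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal rigid alignment.

    Assumes both arrays list the same atoms in the same order (an assumption
    of this sketch, not a statement about the paper's evaluation code).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centering both structures on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch: the SVD of the covariance matrix yields the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and average the squared per-atom distances.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))


# Toy usage: per-reaction RMSDs would be summarized with a median.
reference = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
generated = reference + 0.01  # a pure translation, removed by the alignment
rmsds = [kabsch_rmsd(generated, reference)]
median_rmsd = float(np.median(rmsds))
```

A median over the test reactions, as in the last line, is less sensitive to a few badly generated structures than a mean would be.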

Paper Structure (12 sections, 8 equations, 18 figures, 4 tables)

Figures (18)

  • Figure 1: Overview of the diffusion model and optimal transport framework for generating TS. a. Learning the joint distribution of structures in elementary reactions (reactant in red, TS in yellow, and product in blue). A forward diffusion process brings the joint distribution at $t=T$ to an independent normal distribution at $t=0$. In the backward direction, an object-aware SE(3) GNN is trained with a denoising objective to recover the original joint distribution from the normal distribution. b. Stochastic inference with inpainting in OA-ReactDiff. Starting with samples drawn from a normal distribution, the trained GNN is applied to denoise the reactant, TS, and product. A diffusion process on the reactant and product is combined with the denoising process to ensure that the end-point reactant and product at $t=T$ are the same as the true reactant and product. c. Deterministic inference with React-OT. Both the reactant and product are unchanged throughout the entire process from $t=0$ to $t=T$. The linear interpolation of reactant and product is provided as the initial guess structure at $t=0$, followed by optimal (i.e., linear) transport to the final TS. Atoms are colored as follows: C in gray, N in blue, O in red, and H in white.
  • Figure 1: Reaction network of $\gamma$-ketohydroperoxide (KHP). A two-step reaction network of KHP generated by Yet Another Reaction Program (YARP). Numbers in red/black are activation energies computed on DFT-optimized/React-OT-generated transition states.
  • Figure 2: Structural and energetic performance of diffusion and optimal transport generated TS structures. a. Cumulative probability for structure root mean square deviation (RMSD) (left) and absolute energy error ($|\Delta \mathit{E}_\mathrm{TS}|$) (right) between the true and generated TS on 1,073 set-aside test reactions. Single-shot OA-ReactDiff (blue), 40-shot OA-ReactDiff with recommender (red), single-shot TSDiff (green), and React-OT (orange) are shown. Both RMSD and $|\Delta \mathit{E}_\mathrm{TS}|$ are displayed on a log scale for visibility of the low-error regime. b. Reference TS structure, OA-ReactDiff TS sample (red), and React-OT structure (orange) for select reactions. RMSD and $|\Delta \mathit{E}_\mathrm{TS}|$ for the OA-ReactDiff and React-OT structures are shown in text in the corresponding color. Atoms in the reference TS are colored as follows: C in gray, N in blue, O in red, and H in white. c. Histogram (gray, left y axis) and cumulative probability (blue, right y axis) showing the difference in RMSD (left) and $|\Delta \mathit{E}_\mathrm{TS}|$ (right) between OA-ReactDiff recommended and React-OT structures, each compared to the reference TS. A negative $\Delta$RMSD or $\Delta|\Delta \mathit{E}_\mathrm{TS}|$ indicates that the React-OT structure is of higher quality. A box plot (blue) for $\Delta$RMSD and $\Delta|\Delta \mathit{E}_\mathrm{TS}|$ is shown above the corresponding histogram. A dashed vertical line marks no deviation between the two structures. d. Inference time in seconds for single-shot OA-ReactDiff (blue), 40-shot OA-ReactDiff with recommender (red), and React-OT (orange). The y axis is displayed on a log scale for visibility of the extremely low inference time of React-OT.
  • Figure 2: RMSD between the true TS and different initial guesses. Linear interpolation between reactants and products (blue) and samples from a Gaussian distribution (red).
  • Figure 3: Performance of React-OT with respect to the number of function evaluations (nfe). a. Distribution of RMSD (left) and $|\Delta \mathit{E}_\mathrm{TS}|$ (right) between the true and generated TS on 1,073 set-aside test reactions, where the mean is shown in pink and the median in yellow. The first and third quartiles are bounded by a green box. b. Absolute difference in RMSD (top) and $|\Delta \mathit{E}_\mathrm{TS}|$ (bottom) for generated structures at nfe=6 and nfe=200. Structures where nfe=200 gives better quality are shown in blue; the others are shown in red. A threshold below which the comparison is not chemically meaningful is shown (dashed vertical line). Reference TS structure and React-OT generated structure at nfe=6 (pink) and nfe=200 (sky blue) for a select reaction. Atoms in the reference TS are colored as follows: C in gray, N in blue, O in red, and H in white.
  • ...and 13 more figures
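
The captions for Figure 1c and Figure 3 describe deterministic inference that starts from a linear interpolation of reactant and product and reaches the TS in a small number of function evaluations (nfe). The sketch below illustrates that idea only. Several things in it are assumptions and are not taken from the paper: the interpolation weight `alpha=0.5` (a midpoint), time normalized to the interval [0, 1], a fixed-step Euler integrator, and `velocity_fn`, a hypothetical stand-in for the trained network. The toy velocity field moves along a straight line toward a made-up target.

```python
import numpy as np


def linear_interpolation_guess(reactant, product, alpha=0.5):
    """Initial guess as a linear interpolation of reactant and product.

    alpha=0.5 (the midpoint) is an assumption of this sketch. Both inputs are
    (N, 3) arrays with identical atom ordering.
    """
    reactant = np.asarray(reactant, dtype=float)
    product = np.asarray(product, dtype=float)
    if reactant.shape != product.shape:
        raise ValueError("reactant and product must have the same shape")
    return (1.0 - alpha) * reactant + alpha * product


def euler_transport(x0, velocity_fn, nfe=6):
    """Move x0 along a velocity field with nfe fixed Euler steps.

    velocity_fn(x, t) is a hypothetical placeholder for a trained model;
    time is normalized to [0, 1] here, which is also an assumption.
    """
    if nfe < 1:
        raise ValueError("nfe must be at least 1")
    x = np.array(x0, dtype=float)
    dt = 1.0 / nfe
    for k in range(nfe):
        t = k * dt
        x = x + dt * velocity_fn(x, t)  # one function evaluation per step
    return x


# Toy usage with made-up coordinates: a constant velocity field pointing
# from the initial guess to a fictitious target structure.
reactant = np.zeros((3, 3))
product = np.full((3, 3), 2.0)
guess = linear_interpolation_guess(reactant, product)
toy_target = guess + np.array([0.1, -0.2, 0.05])
generated = euler_transport(guess, lambda x, t: toy_target - guess, nfe=6)
```

With a straight-line (constant) velocity field, Euler integration reaches the end point regardless of the step count, which is one way to see why a linear transport path can tolerate a small nfe.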