Anticipating the Selectivity of Intramolecular Cyclization Reaction Pathways with Neural Network Potentials

Nicholas Casetti, Dylan Anstine, Olexandr Isayev, Connor W. Coley

TL;DR

The paper tackles the challenge of exploring mechanistic pathways for complex cyclizations that involve multiple concurrent bond changes, where exhaustive graph-based enumeration can be intractable. It introduces REVAMP, a framework that integrates graph-based intermediate generation, stereoenumeration, fast ML screening with MPNNs, a reactive neural network potential (AIMNet2-rxn) for kinetic and thermodynamic assessment, and targeted DFT refinement to build mechanistic networks efficiently. Key results show that AIMNet2-rxn can reproduce DFT barrier heights and transition-state geometries with high fidelity, predict stereochemical preferences in intramolecular Diels-Alder reactions, and retrospectively validate key steps in natural product syntheses (e.g., salvinorin A intermediate and endiandric acid C). The approach yields actionable, cost-effective insights for natural product synthesis planning, supported by open-source code and explicit discussion of current limitations and future directions to broaden chemical space with newer NN potentials such as OMol25 and AIMNet2 variants.

Abstract

Reaction mechanism search tools have demonstrated the ability to provide insights into likely products and rate-limiting steps of reacting systems. However, reactions involving several concerted bond changes - as can be found in many key steps of natural product synthesis - can complicate the search process. To mitigate these complications, we present a mechanism search strategy particularly suited to help expedite exploration of an exemplary family of such complex reactions, cyclizations. We provide a cost-effective strategy for identifying relevant elementary reaction steps by combining graph-based enumeration schemes and machine learning techniques for intermediate filtering. Key to this approach is our use of a neural network potential (NNP), AIMNet2-rxn, for computational evaluation of each candidate reaction pathway. In this article, we evaluate the NNP's ability to estimate activation energies, demonstrate the correct anticipation of stereoselectivity, and recapitulate complex enabling steps in natural product synthesis.
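
The idea of filtering enumerated intermediates with a fast model before spending NNP evaluations on them can be sketched as a two-stage funnel. This is a minimal illustration, not the REVAMP implementation: the function name `progressive_filter`, the two scoring callables, both cutoffs, and the toy numbers are all hypothetical stand-ins for an MPNN screen and an NNP barrier calculation.

```python
from typing import Callable, List, Sequence, Tuple


def progressive_filter(
    candidates: Sequence[str],
    cheap_score: Callable[[str], float],
    expensive_barrier: Callable[[str], float],
    cheap_cutoff: float,
    barrier_cutoff: float,
) -> List[Tuple[str, float]]:
    """Two-stage funnel (hypothetical): screen cheaply, then evaluate survivors.

    cheap_score stands in for a fast ML estimate; expensive_barrier stands in
    for a costlier barrier calculation. Both return values in kcal/mol here.
    """
    # Stage 1: keep only candidates whose cheap estimate passes the cutoff.
    shortlisted = [c for c in candidates if cheap_score(c) <= cheap_cutoff]
    # Stage 2: the expensive model is called on the shortlist only.
    evaluated = [(c, expensive_barrier(c)) for c in shortlisted]
    kept = [(c, b) for c, b in evaluated if b <= barrier_cutoff]
    # Rank surviving steps from lowest to highest barrier.
    return sorted(kept, key=lambda pair: pair[1])


# Toy data (made up for illustration, not taken from the paper).
cheap_estimates = {"A": 10.0, "B": 55.0, "C": 20.0, "D": 30.0}
barriers = {"A": 22.0, "B": 18.0, "C": 41.0, "D": 15.0}
expensive_calls: List[str] = []


def toy_barrier(name: str) -> float:
    # Record each call so the cost saving of the funnel is visible.
    expensive_calls.append(name)
    return barriers[name]


ranked = progressive_filter(
    list(cheap_estimates),
    cheap_estimates.__getitem__,
    toy_barrier,
    cheap_cutoff=40.0,
    barrier_cutoff=35.0,
)
```

In this toy run candidate B never reaches the expensive stage even though its barrier would have been low, which shows the trade-off of any such funnel: a loose cheap cutoff costs more evaluations, while a tight one risks discarding a relevant step.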

Paper Structure

This paper contains 15 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Overview of the methodology of REVAMP. REVAMP performs a b4f4 enumeration, including the use of stereoenumeration, and handles the larger resulting pool of intermediates by progressive filtering techniques using a combination of message passing neural networks (MPNNs) and a neural network potential (NNP).
  • Figure 2: A. Two cyclization reaction mechanisms as explored by mita_prediction_2022. Using REVAMP with AIMNet2-rxn finds transition states for all 6 mechanistic steps whereas REVAMP with GFN2-xTB misses an 8$\pi$ electrocyclization (6 $\to$ 7). B. Total wall time taken to perform kinetic feasibility calculations for AIMNet2-rxn and GFN2-xTB. C. Barrier heights as calculated by AIMNet2-rxn and GFN2-xTB compared to DFT. MAEs are calculated without circled outliers. Error bars are the sum of the standard deviation of the energies of the reactant and transition state between three AIMNet2-rxn model seeds. D. Transition state RMSDs calculated between AIMNet2-rxn and DFT. The highest RMSD transition state is superimposed on the DFT transition state.
  • Figure 3: Successful prediction of intramolecular Diels-Alder stereoselectivity. A,C. The chemical reaction network returned by REVAMP starting from 9 and 12 respectively where arrow color represents barrier height and node color represents intermediate energy relative to the reactant. Larger nodes correspond to species along the reaction pathway. B. The cis and trans products of 9 are labeled with similar barriers matching experimental results of only weak preference for the trans stereoisomer. D. The extra steric hindrance associated with 12 leads REVAMP to prune the cis product on the basis of AIMNet2-rxn predictions, recapitulating the experimentally-observed selectivity of this reaction. The energies of each state are reported in kcal/mol normalized to the reactants.
  • Figure 4: Analysis of the synthesis of salvinorin A. A. The full synthesis route from zimdars_protectinggroupfree_2021; we apply REVAMP to evaluate the key step boxed in red through a mechanism search. B. The chemical reaction network returned by REVAMP starting from 15 where arrow color represents barrier height and node color represents intermediate energy relative to the reactant. Larger nodes correspond to species along the reaction pathway. C. The lowest barrier step found by REVAMP is consistent with the synthesis route from zimdars_protectinggroupfree_2021. The energies of each state are reported in kcal/mol normalized to the reactants.
  • Figure 5: Analysis of the synthesis of endiandric acid C. A. The chemical reaction network returned by REVAMP starting from 17 where arrow color represents barrier height and node color represents intermediate energy relative to the reactant. Larger nodes correspond to species along the reaction pathway. B. A kinetically favorable pathway identified by REVAMP corresponds to the synthesis performed by nicolaou_endiandric_1982 involving multiple stereoselective cyclizations with many simultaneous bond breaking/forming events. The energies of each state are reported in kcal/mol normalized to the reactants. C. The AIMNet2-rxn and DFT calculated barriers for the favored electrocyclizations along with their less favored stereoisomers. A value of N/A means a transition state was not located for the given reaction. The barriers are reported in kcal/mol normalized to the immediate precursor.
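
The statistics described in the Figure 2C caption (error bars as the sum of reactant and transition-state standard deviations across three model seeds, and MAEs computed without the circled outliers) can be sketched as below. Several details are assumptions rather than facts from the paper: the barrier is taken as the difference of seed-averaged energies, the sample standard deviation is used, and the function names and all numbers are made up for illustration.

```python
import statistics
from typing import Collection, Sequence, Tuple


def barrier_with_error(
    reactant_energies: Sequence[float],
    ts_energies: Sequence[float],
) -> Tuple[float, float]:
    """Barrier and error bar from per-seed energies in kcal/mol.

    Assumptions: the barrier is the difference of seed-averaged energies,
    and the spread is the sample standard deviation across seeds.
    """
    barrier = statistics.mean(ts_energies) - statistics.mean(reactant_energies)
    # Error bar as in the caption: sum of the two standard deviations.
    error = statistics.stdev(reactant_energies) + statistics.stdev(ts_energies)
    return barrier, error


def mae_excluding(
    predicted: Sequence[float],
    reference: Sequence[float],
    outliers: Collection[int] = (),
) -> float:
    """Mean absolute error, skipping the indices flagged as outliers."""
    errors = [
        abs(p - r)
        for i, (p, r) in enumerate(zip(predicted, reference))
        if i not in outliers
    ]
    return sum(errors) / len(errors)


# Toy per-seed energies for one reaction step (three seeds, kcal/mol).
barrier, error = barrier_with_error(
    reactant_energies=[0.0, 0.2, -0.2],
    ts_energies=[25.0, 25.5, 24.5],
)

# Toy predicted vs. reference barriers; index 2 plays the circled outlier.
predicted = [25.0, 18.0, 40.0]
reference = [24.0, 20.0, 30.0]
mae_all = mae_excluding(predicted, reference)
mae_trimmed = mae_excluding(predicted, reference, outliers={2})
```

Summing the two standard deviations gives a conservative bar compared with combining them in quadrature, since it does not assume the reactant and transition-state errors are independent across seeds.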