Reliable and Efficient Automated Transition-State Searches with Machine-Learned Interatomic Potentials

Jonah Marks, Jonathon Vandezande, Joseph Gomes

Abstract

Transition-state searches are central to understanding reaction mechanisms, but the high computational cost of density-functional theory (DFT) limits their application in high-throughput catalyst and materials discovery. Machine-learned interatomic potentials (MLIPs) offer near-DFT accuracy at orders-of-magnitude lower cost, yet their reliability for transition-state searches remains underexplored. Here, we systematically benchmark hybrid transition-state-search workflows combining six freely available potentials (MACE-OMol25, UMA-Small, UMA-Medium, eSEN-S, AIMNet2, and GFN2-xTB) with two reaction-path-finding algorithms (the freezing-string method and climbing-image nudged elastic band) across 58 diverse reactions spanning small organics, polymerization chemistry, and transition-metal catalysis. We find that models trained on the Open Molecules 2025 dataset exhibit markedly superior performance, with MACE-OMol25 achieving a 96.6% success rate while requiring fewer than four DFT-gradient evaluations per reaction on organic systems, a 94-96% reduction compared to conventional DFT-based searches. Low-level refinement on the MLIP surface before high-level DFT optimization reduces computational cost threefold with minimal loss in reliability. For transition-metal systems, UMA-Medium demonstrates promising transferability to in-distribution transition-metal-complex reactions and to out-of-distribution organometallic C-H activation. These results establish MLIP-accelerated workflows as practical tools for automated reaction discovery, enabling near-DFT accuracy at a fraction of the traditional expense.
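The cost figures in the abstract can be related with simple arithmetic. The sketch below is illustrative only: it assumes the quoted reduction is measured as one minus the ratio of hybrid to conventional DFT-gradient counts, and it treats "fewer than four" as an upper bound of four. The function name `implied_baseline` is hypothetical and does not come from the paper.

```python
def implied_baseline(hybrid_evals: float, reduction: float) -> float:
    """Return the baseline count B satisfying hybrid_evals = (1 - reduction) * B.

    Assumption (not stated in the abstract): the percentage reduction is
    defined as 1 - hybrid_evals / B, where B is the number of DFT-gradient
    evaluations used by a conventional DFT-only search.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must lie in [0, 1)")
    return hybrid_evals / (1.0 - reduction)


if __name__ == "__main__":
    # Upper bound of four gradient evaluations, with the quoted 94% and 96% reductions.
    for reduction in (0.94, 0.96):
        baseline = implied_baseline(4.0, reduction)
        print(f"{reduction:.0%} reduction -> at most about {baseline:.0f} baseline evaluations")
```

Under these assumptions, the quoted numbers imply a conventional search costing at most roughly 67 to 100 DFT-gradient evaluations per reaction. The abstract itself does not report the baseline counts.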

Paper Structure

This paper contains 28 sections, 6 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Transition-state-search workflows. Green boxes denote steps performed with low-level potentials, while the blue box denotes use of the $\omega$B97X-V/def2-TZVP level of theory.
  • Figure 2: Reactant, transition-state, and product geometries for the (A) scandium-catalyzed olefin insertion, (B) iron-catalyzed transfer hydrogenation, and (C) platinum-mediated metallabenzene rearrangement case studies.
  • Figure 3: Transition-state structures for the Rh(III)-catalyzed C–H activation step in oxidative carbonylation of toluene. (A) Reference transition state optimized at the $\omega$B97X-V/def2-TZVP level of theory. (B) Transition state obtained from the UMA-M low-level-refined workflow after DFT refinement. Key atoms defining the reactive core are labeled in panel (A).
  • Figure 4: Native guesses, low-level-refined structures, and corresponding converged $\omega$B97X-V/def2-TZVP transition states for reactions 11 (CH$_3$CH$_3$ $\rightarrow$ CH$_2$CH$_2$ + H$_2$) and 24 (HCNH$_2$ $\rightarrow$ HCN + H$_2$) of the Baker set using FSM with UMA-M. (A) Native MLIP guess for reaction 11. (B) Low-level-refined MLIP structure for reaction 11. (C) Higher-energy saddle point obtained from (B) after DFT refinement. (D) Reference transition state for reaction 11 at the $\omega$B97X-V/def2-TZVP level. (E) Native MLIP guess for reaction 24. (F) Low-level-refined MLIP structure for reaction 24. (G) Higher-energy second-order saddle point obtained from (F) after DFT refinement. (H) Reference transition state for reaction 24 at the $\omega$B97X-V/def2-TZVP level.
  • Figure 5: Performance of GFN2-xTB, AIMNet2, eSEN-S, UMA-S, UMA-M, and MACE-OMol25 for native and low-level-refined guess generation via the CI-NEB method on the Baker set. Performance is measured by successful convergence to the reference transition state and by the number of DFT-gradient evaluations required. Italicized values denote failed runs, with superscripts denoting the failure mode.
  • ...and 1 more figure