3DReact: Geometric deep learning for chemical reactions

Puck van Gerwen, Ksenia R. Briling, Charlotte Bunne, Vignesh Ram Somnath, Ruben Laplaza, Andreas Krause, Clemence Corminboeuf

TL;DR

3DReact introduces a geometry-aware, symmetry-respecting framework for predicting reaction barriers from 3D reactant and product structures, with optional atom-mapping information. By operating with invariant or equivariant molecular channels and offering both mapping-based and non-mapping variants, it achieves robust performance across chemically diverse datasets and extrapolation scenarios. The results show invariant models are often sufficient, while geometry-based variants provide advantages when mappings are unavailable or when angular geometry dominates. The approach also demonstrates resilience to lower-quality geometries and flexible integration of mapping information, making it a practical tool for reaction-property prediction across datasets.

Abstract

Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction datasets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS and Proparg-21-TS datasets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different datasets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.
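The distinction between invariant and equivariant features mentioned in the abstract can be illustrated with a small sketch. It is not taken from the 3DReact code: the three-atom geometry, the helper names (`rotate_z`, `pairwise_distances`, `first_bond_vector`), and the choice of a rotation about the z axis are all assumptions made for illustration. Interatomic distances stay the same when the structure is rotated (invariant), whereas a vector between two atoms rotates along with the structure (equivariant).

```python
import math
from itertools import combinations

def rotate_z(point, angle):
    """Rotate a 3D point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def pairwise_distances(coords):
    """Invariant feature: all interatomic distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]

def first_bond_vector(coords):
    """Equivariant feature: displacement from atom 0 to atom 1."""
    return tuple(b - a for a, b in zip(coords[0], coords[1]))

# Hypothetical, made-up three-atom geometry (coordinates in angstrom).
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
angle = math.pi / 3
rotated = [rotate_z(p, angle) for p in coords]

# Invariance: distances are unchanged by the rotation.
d_before = pairwise_distances(coords)
d_after = pairwise_distances(rotated)

# Equivariance: the feature of the rotated structure equals
# the rotated feature of the original structure.
v_after = first_bond_vector(rotated)
v_expected = rotate_z(first_bond_vector(coords), angle)
```

A model built only from quantities like `d_before` cannot change its prediction under rotation, while a model carrying quantities like `v_after` must transform them consistently through every layer.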

Paper Structure (19 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Architecture of 3DReact. Molecules pass through independent symmetry-adapted (invariant or equivariant) channels (green and orange). These are combined to yield a reaction representation (blue) which is used to predict a reaction property, such as the activation energy (red dot).
  • Figure 2: Scheme illustrating how the reactant (green) and product (orange) representations are combined to form a reaction representation (blue) and eventually predict the target property (red dot) using a multilayer perceptron (MLP). $\sum$ refers to the summation over atom-wise environments. Oblong rectangles and squares represent vectors and scalars, respectively.
  • Figure 3: Learning curves for InReact and EquiReact in the "True" atom-mapping regime. Each point shows mean absolute error (MAE), averaged over 10 folds of 80/10/10 splits (for training set fraction $< 0.8$, the corresponding subset of the "full" training set is used), and error bars indicate standard deviations across folds.
  • Figure 4: Top: Reactant and product of a toy reaction: two homometric structures (a) and (b) with atom labels, atom coordinates (Å), and interatomic distances (Å). "Bonds" of the same length are of the same color. Bottom: output after 5 epochs of the invariant (c,d,e) / equivariant (f,g) molecular channels for each atom with different radial cutoffs $r_{\max}$ and number of convolutional layers $n_\mathrm{conv}$. Within each subfigure, atomic representations indistinguishable up to shown digits are marked by the same color.
  • Figure 5: Mean absolute errors (MAEs) of predictions using three different extrapolation splits: scaffold-, size-, and property-based. All datasets are compared in three atom-mapping regimes: "True", "RXNMapper" (RXNM), and "None", except for the Proparg-21-TS set, where RXNMapper cannot map the reaction SMILES. MAEs are averaged over 10 folds of 80/10/10 splits (training/validation/test), and error bars indicate standard deviations across folds, where applicable.
  • ...and 3 more figures
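
The pipeline described in the Figure 2 caption (atom-wise vectors summed, reactant and product representations combined, a multilayer perceptron producing a scalar) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the caption does not state the combination rule, so taking the pooled product vector minus the pooled reactant vector is an assumption, and the random features, layer sizes, and function names (`sum_pool`, `reaction_vector`, `mlp`) are hypothetical.

```python
import math
import random

def sum_pool(atom_vectors):
    """Sum atom-wise vectors component by component into one molecular vector."""
    return [sum(column) for column in zip(*atom_vectors)]

def reaction_vector(reactant_atoms, product_atoms):
    """Assumed combination rule: pooled product minus pooled reactant."""
    pooled_r = sum_pool(reactant_atoms)
    pooled_p = sum_pool(product_atoms)
    return [p - r for p, r in zip(pooled_p, pooled_r)]

def mlp(x, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a linear layer; returns a scalar."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

rng = random.Random(0)
n_atoms, dim, hidden_dim = 5, 8, 16

# Stand-ins for the outputs of the reactant and product molecular channels.
reactant_atoms = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_atoms)]
product_atoms = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_atoms)]

# Untrained, randomly initialised weights (illustration only).
w1 = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [rng.gauss(0, 1) for _ in range(hidden_dim)]
b2 = 0.0

x = reaction_vector(reactant_atoms, product_atoms)
barrier = mlp(x, w1, b1, w2, b2)

# Summation over atoms makes the reaction vector independent of atom order.
order = list(range(n_atoms))
rng.shuffle(order)
x_shuffled = reaction_vector([reactant_atoms[i] for i in order],
                             [product_atoms[i] for i in order])
```

Under this assumed difference rule, a "reaction" whose product equals its reactant maps to the zero vector, and reordering the atoms leaves the reaction vector unchanged up to floating-point rounding.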