Graph neural networks for the prediction of molecular structure-property relationships

Jan G. Rittig, Qinghe Gao, Manuel Dahmen, Alexander Mitsos, Artur M. Schweidtmann

TL;DR

This paper surveys graph neural networks (GNNs) for predicting molecular structure–property relationships, arguing that end-to-end learning directly from molecular graphs can outperform traditional QSPR/QSAR descriptor-based approaches. It explains the core GNN framework, including molecular graphs, message passing, and readout/pooling, and demonstrates two case studies: regression of normal boiling points and binary biodegradability classification, using edge-conditioned graph convolutions and ensemble strategies. The results show high predictive accuracy (e.g., MAE reductions with ensembles and AUROC > 0.9 for biodegradability) and illustrate how GNNs can learn informative representations without predefined descriptors. The work underscores the potential of GNNs for rapid, scalable in silico screening and design in chemistry and chemical engineering, while pointing to future directions in uncertainty quantification, interpretability, and integrating physical knowledge into the models.

Abstract

Molecular property prediction is of crucial importance in many disciplines such as drug discovery, molecular biology, or material and process design. The frequently employed quantitative structure-property/activity relationships (QSPRs/QSARs) characterize molecules by descriptors, which are then mapped to the properties of interest via a linear or nonlinear model. In contrast, graph neural networks (GNNs), a novel machine learning method, work directly on the molecular graph, i.e., a graph representation where atoms correspond to nodes and bonds correspond to edges. GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for informative descriptors as in QSPRs/QSARs. GNNs have been shown to achieve state-of-the-art prediction performance on various property prediction tasks and represent an active field of research. We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
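
To make the ideas of a molecular graph, message passing, and pooling more concrete, the following is a minimal sketch in plain Python. It is not the model from the paper: the paper uses edge-conditioned graph convolutions with learned weights, whereas this sketch uses a parameter-free, bond-order-weighted neighbor sum as a stand-in. The two-entry atom features, the function names, and the hard-coded heavy-atom graph of 4-methyl-2-pentanone (SMILES `CC(=O)CC(C)C`, hydrogens omitted) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: an untrained, parameter-free stand-in for a GNN layer.
# Heavy-atom graph of 4-methyl-2-pentanone, SMILES CC(=O)CC(C)C (hydrogens omitted).
ATOMS = ["C", "C", "O", "C", "C", "C", "C"]          # nodes
BONDS = [(0, 1, 1), (1, 2, 2), (1, 3, 1),             # edges as (i, j, bond order)
         (3, 4, 1), (4, 5, 1), (4, 6, 1)]


def initial_features(atoms):
    """Assumed node features: one-hot atom type [is_carbon, is_oxygen]."""
    return [[1.0, 0.0] if a == "C" else [0.0, 1.0] for a in atoms]


def neighbor_lists(bonds, n_atoms):
    """Undirected adjacency list; each entry stores (neighbor index, bond order)."""
    adj = [[] for _ in range(n_atoms)]
    for i, j, order in bonds:
        adj[i].append((j, order))
        adj[j].append((i, order))
    return adj


def message_passing_step(h, adj):
    """Each node adds the bond-order-weighted sum of its neighbors' states to its own."""
    new_h = []
    for v, h_v in enumerate(h):
        msg = [0.0] * len(h_v)
        for w, order in adj[v]:
            for k, x in enumerate(h[w]):
                msg[k] += order * x
        new_h.append([a + m for a, m in zip(h_v, msg)])
    return new_h


def sum_pool(h):
    """Readout: sum node states into one molecule-level vector."""
    return [sum(column) for column in zip(*h)]


adj = neighbor_lists(BONDS, len(ATOMS))
h0 = initial_features(ATOMS)
h1 = message_passing_step(h0, adj)   # radius-1 neighborhood information
fingerprint = sum_pool(h1)           # fixed-size vector for the whole molecule
```

After one step, the carbonyl carbon (index 1) already carries information about its double-bonded oxygen, and repeating the step enlarges the neighborhood each atom "sees". Because the readout is a sum, the resulting vector does not depend on the order in which the atoms are listed. In a trained GNN, the aggregation and update would contain learnable parameters and the pooled vector would feed a regression or classification head.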

Paper Structure (21 sections, 11 equations, 7 figures, 4 tables)

This paper contains 21 sections, 11 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Neighborhood aggregation scheme for one atom (bold C atom at lower left). $R$ indicates the radius, i.e., the size of the neighborhood. Illustrative example for 4-methyl-2-pentanone.
  • Figure 2: Overview of a GNN model for property prediction, similar to the GNN we used for predicting fuel ignition qualities of hydrocarbons, cf. Schweidtmann et al. (2020).
  • Figure 3: Generation of an attributed molecular graph, illustrative example for 4-methyl-2-pentanone.
  • Figure 4: Information exchange between nodes within an edge-conditioned graph convolutional layer in the message passing phase of a graph neural network. The update step is illustrated for node #2. Adapted from our publication Schweidtmann et al. (2020).
  • Figure 5: Pooling step in a graph neural network, illustrative example for 4-methyl-2-pentanone.
  • ...and 2 more figures