Improving Predictions of Molecular Properties with Graph Featurisation and Heterogeneous Ensemble Models

Michael L. Parker, Samar Mahmoud, Bailey Montefiore, Mario Öeren, Himani Tandon, Charlotte Wharrick, Matthew D. Segall

TL;DR

The paper tackles the descriptor selection challenge in molecular property prediction by proposing a MetaModel that ensembles diverse ML models trained on a fusion of task-specific GNN descriptors and general RDKit descriptors. It introduces a ChemProp-based featurisation to derive MPNN descriptors and integrates them with fixed features in a heterogeneous model to boost predictive accuracy across MoleculeNet datasets. Results show that the MetaModel often outperforms a strong GNN baseline (ChemProp), particularly in regression, and gains further from GNN features on datasets where the ensemble would otherwise underperform, while hyperparameter optimisation offers limited gains. The work demonstrates that combining learned representations with fixed descriptors and leveraging model diversity yields robust improvements, offering practical guidance for scalable molecular property prediction and avenues for future multi-target and advanced tuning research.

Abstract

We explore a "best-of-both" approach to modelling molecular properties by combining learned molecular descriptors from a graph neural network (GNN) with general-purpose descriptors and a mixed ensemble of machine learning (ML) models. We introduce a MetaModel framework to aggregate predictions from a diverse set of leading ML models. We present a featurisation scheme for combining task-specific GNN-derived features with conventional molecular descriptors. We demonstrate that our framework outperforms the cutting-edge ChemProp model on all regression datasets tested and 6 of 9 classification datasets. We further show that including the GNN features derived from ChemProp boosts the ensemble model's performance on several datasets where it otherwise would have underperformed. We conclude that to achieve optimal performance across a wide set of problems, it is vital to combine general-purpose descriptors with task-specific learned features and use a diverse set of ML models to make the predictions.
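
The two ideas in the abstract, concatenating learned features with general-purpose descriptors and averaging over dissimilar models, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `combine_features`, `KNNRegressor`, `LinearGD` and `MeanEnsemble` are hypothetical, the features are random stand-ins rather than real ChemProp or RDKit outputs, and a plain mean is used as the aggregation rule because the abstract does not say how the MetaModel combines its sub-models.

```python
import math
import random


def combine_features(learned, fixed):
    """Concatenate learned and general-purpose descriptors, molecule by molecule."""
    return [list(a) + list(b) for a, b in zip(learned, fixed)]


class KNNRegressor:
    """Predicts the mean target of the k nearest training points."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            order = sorted(range(len(self.X)), key=lambda i: math.dist(x, self.X[i]))
            nearest = order[: self.k]
            preds.append(sum(self.y[i] for i in nearest) / len(nearest))
        return preds


class LinearGD:
    """Linear model fitted by full-batch gradient descent on squared error."""

    def __init__(self, lr=0.05, epochs=300):
        self.lr, self.epochs = lr, epochs

    def _one(self, x):
        return sum(w * v for w, v in zip(self.w, x)) + self.b

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            errors = [self._one(x) - t for x, t in zip(X, y)]
            grad_w = [2 / n * sum(e * x[j] for e, x in zip(errors, X)) for j in range(d)]
            grad_b = 2 / n * sum(errors)
            self.w = [w - self.lr * g for w, g in zip(self.w, grad_w)]
            self.b -= self.lr * grad_b
        return self

    def predict(self, X):
        return [self._one(x) for x in X]


class MeanEnsemble:
    """Averages sub-model predictions (the aggregation rule is an assumption)."""

    def __init__(self, models):
        self.models = models

    def fit(self, X, y):
        for m in self.models:
            m.fit(X, y)
        return self

    def predict(self, X):
        per_model = [m.predict(X) for m in self.models]
        return [sum(col) / len(col) for col in zip(*per_model)]


rng = random.Random(0)
learned = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]  # stand-in for GNN features
fixed = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(40)]    # stand-in for fixed descriptors
X = combine_features(learned, fixed)
y = [x[0] - 2.0 * x[4] + rng.gauss(0, 0.1) for x in X]              # synthetic target

ensemble = MeanEnsemble([LinearGD(), KNNRegressor(k=3)]).fit(X[:30], y[:30])
predictions = ensemble.predict(X[30:])
```

The two sub-models are deliberately different in kind (one parametric, one instance-based), which is the sense of "diverse" the sketch tries to capture. A real pipeline would use established ML libraries and a held-out evaluation protocol.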

Paper Structure

This paper contains 20 sections, 9 figures, 3 tables.

Figures (9)

  • Figure 1: The process used to featurise molecules using ChemProp. A) A ChemProp model, consisting of an MPNN that derives a latent molecular representation and an FFN head that makes predictions, is trained to predict molecular properties from a dataset of molecules and (optional) external descriptors. B) We calculate a new set of features (referred to as MPNN descriptors) by taking the output of the MPNN with the FFN head removed, after training in step A. C) These new descriptors are combined with (optional) external descriptors and used to train a MetaModel comprising diverse ML sub-models.
  • Figure 2: Mean relative performance of ChemProp and the MetaModel (with and without ChemProp features) for each regression dataset. The metric used in each case (as recommended in MoleculeNet) is shown in brackets next to the dataset name. In every case, lower is better. Where multiple targets are present in a dataset, we first calculate the relative metrics, then the geometric mean. We use the geometric mean as some datasets contain MAE/RMSE values for different targets that differ by several orders of magnitude. The top panel shows the absolute values, and the bottom panel shows them normalised to the ChemProp mean.
  • Figure 3: Model performance on the classification datasets, averaged over all targets for each dataset. The top panel shows the absolute values, and the bottom panel shows values normalised by the ChemProp AUC. In every case, metrics are shown in brackets, and higher is better. Where multiple targets are present, we calculate the arithmetic mean AUC value for each model separately, then calculate the ratios of the means. This avoids division by zero or very small AUC values skewing the result.
  • Figure 4: Performance of the MetaModel and ChemProp on the regression datasets with no external descriptors (lower is better). The top panel shows the absolute values, and the bottom panel shows the values normalised to those of the baseline ChemProp model (MPNN + RDKit). Metrics are shown in brackets, and aggregated as in Fig. 2.
  • Figure 5: MetaModel and ChemProp performance on the classification datasets with no external descriptors (higher is better). In the bottom panel, scores are normalised relative to those of the baseline ChemProp model (MPNN + RDKit). Metrics are shown in brackets, and aggregated as in Fig. 3.
  • ...and 4 more figures
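
The captions for Figures 2 and 3 describe two different ways of aggregating over targets: a geometric mean of per-target relative errors for regression, and a ratio of arithmetic mean AUCs for classification. The sketch below shows both under stated assumptions. The function names are hypothetical and the numbers are invented purely for illustration; they are not results from the paper.

```python
import math


def geometric_mean_relative(model_errors, baseline_errors):
    """Regression: per-target error ratios first, then their geometric mean."""
    ratios = [m / b for m, b in zip(model_errors, baseline_errors)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def ratio_of_mean_auc(model_aucs, baseline_aucs):
    """Classification: arithmetic mean AUC per model first, then the ratio of means."""
    model_mean = sum(model_aucs) / len(model_aucs)
    baseline_mean = sum(baseline_aucs) / len(baseline_aucs)
    return model_mean / baseline_mean


# Two regression targets whose errors differ by orders of magnitude (made-up values).
model_errors = [0.5, 200.0]
baseline_errors = [1.0, 100.0]
relative_error = geometric_mean_relative(model_errors, baseline_errors)

# Two classification targets (made-up AUC values).
model_aucs = [0.90, 0.78]
baseline_aucs = [0.80, 0.80]
relative_auc = ratio_of_mean_auc(model_aucs, baseline_aucs)
```

In the regression example the per-target ratios are 0.5 and 2.0, so the geometric mean treats a halving on one target and a doubling on the other as cancelling out, regardless of the raw scale of each target. For classification, averaging before dividing keeps a single target with a very small baseline AUC from dominating the ratio, which is the motivation given in the Figure 3 caption.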