Improved Uncertainty Estimation of Graph Neural Network Potentials Using Engineered Latent Space Distances

Joseph Musielewicz; Janice Lan; Matt Uyttendaele; John R. Kitchin

TL;DR

This work tackles uncertainty estimation for graph neural network predictions of relaxed energies in catalytic materials, where relaxations induce non-Gaussian error distributions. It benchmarks four UQ approaches (ensembles, latent-space distances, mean-variance estimation, and sequence regression) using distribution-free calibration metrics such as $CI(Var(Z))$ and error-based calibration plots, with recalibration on a calibration set and evaluation on a test set. The results show that latent-distance methods, especially those using per-atom, invariant latent representations, provide the best local calibration and robust cross-domain performance, outperforming ensembles and residual models. The study demonstrates a practical path to trustworthy high-throughput screening with AdsorbML and the Open Catalyst Project, offering interpretable examples and a recalibration protocol to guide future development.

Abstract

Graph neural networks (GNNs) have been shown to be astonishingly capable models for molecular property prediction, particularly as surrogates for expensive density functional theory calculations of relaxed energy for novel material discovery. However, one limitation of GNNs in this context is the lack of useful uncertainty prediction methods, even though uncertainty prediction is critical to the material discovery pipeline. In this work, we show that uncertainty quantification for relaxed energy calculations is more complex than uncertainty quantification for other kinds of molecular property prediction, because structure optimizations change the error distribution. We propose that distribution-free techniques are more useful tools for assessing calibration, recalibrating, and developing uncertainty prediction methods for GNNs performing relaxed energy calculations. We also develop a relaxed energy task for evaluating uncertainty methods for equivariant GNNs, based on distribution-free recalibration and using the Open Catalyst Project dataset. We benchmark a set of popular uncertainty prediction methods on this task, and show that latent distance methods, with our novel improvements, are the best-calibrated and most economical approach for relaxed energy calculations. Finally, we demonstrate that our latent space distance method produces results that align with our expectations on a clustering example, and on specific equation of state and adsorbate coverage examples from outside the training dataset.
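To make the distribution-free metric named in the TL;DR more concrete, the following is a minimal sketch of a $Var(Z)$ check with a confidence interval. It assumes that $Z$ is the prediction error divided by the predicted uncertainty and that the interval comes from a percentile bootstrap; the function names and the bootstrap settings are illustrative assumptions, not the paper's implementation. If the predicted uncertainty equals the standard deviation of a zero-mean error, $Var(Z)$ is 1 whatever the shape of the error distribution, which is why the check does not rely on Gaussian errors.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (n - 1 denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def z_score_variance_ci(errors, sigmas, n_boot=1000, alpha=0.05, seed=0):
    """Return Var(Z) and a percentile-bootstrap confidence interval.

    Assumption: Z = error / predicted sigma. Var(Z) > 1 suggests
    overconfident uncertainties, Var(Z) < 1 underconfident ones.
    """
    z = [e / s for e, s in zip(errors, sigmas)]
    rng = random.Random(seed)
    # Resample the z-scores with replacement and recompute the variance.
    boot = sorted(
        sample_variance(rng.choices(z, k=len(z))) for _ in range(n_boot)
    )
    lo = boot[int(alpha / 2 * n_boot)]
    hi = boot[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sample_variance(z), (lo, hi)
```

Under these assumptions, a set of uncertainties would be judged consistent with calibration when the returned interval contains 1.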

Paper Structure (16 sections, 2 equations, 7 figures, 3 tables)

Introduction
Background
AdsorbML
Graph Neural Networks
Uncertainty Quantification
Methods
Uncertainty Quantification Methods for GNNs
Uncertainty Validation Metrics
Recalibration and Evaluation
Results
Error Distribution and Uncertainty Validation Metrics
Benchmarking Uncertainty Quantification
Comparing Distance Methods
Interpretable Examples
Conclusion
...and 1 more section

Figures (7)

Figure 1: Diagram of the components of the latent distance uncertainty prediction method. It details the setup procedure and the inference procedure. The setup procedure must be performed once per model training set. The inference procedure is first used on the calibration set during the setup procedure, and then every time inference is performed afterwards to predict the model uncertainty. Note that the calibration steps at the end of the setup procedure and inference procedure can be applied to any of the uncertainty prediction methods.
Figure 2: Violin and parity plots for a latent distance uncertainty prediction. After calibrating uncertainty predictions with error-based calibration, they can be interpreted as predictions of the dispersion of the residual for that inference calculation. The violin plot shows the measured error distributions for five uncertainty prediction bins. The contour parity plot shows contour lines tracing five different quantiles of measured error, rolling along the predicted uncertainty.
Figure 3: Calibration plots for the best performing iterations of the four categories of methods discussed in this work. The diagram above each plot shows how each method estimates the variance of the relaxed energy trajectory predicted by the graph neural network.
Figure 4: Contour parity plots for the best two method classes, predicting on the test set after recalibration on the calibration set. The panels show uncertainties predicted by the best performing ensemble method and by the best performing latent distance method. The quantile curves of the latent distance method are generally monotonic, which aligns with its better local calibration performance. The ensemble method uncertainties are shifted right by recalibration, and show a significant drop in the uncertainty estimates after 0.2 eV and an abrupt increase after 0.3 eV.
Figure 5: Plots of UMAP dimensionality reduction performed on the equivariant (all channels) and invariant (m=0, l=0) latent space representations sampled from Equiformer V2 without edge alignment. Latent space representations were sampled for a subset of the training set containing specific elements. We see that the different elements represented are clearly clustered in both plots, but that there is significantly more noise in the clustering of similar elements in the equivariant latent space, while the invariant latent space clusters are much denser and less noisy.
...and 2 more figures
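
The setup and inference procedure described in the Figure 1 caption can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes a Euclidean nearest-neighbor distance to stored training latents and a single scale factor fitted on the calibration set, and the names `LatentDistanceUQ`, `nearest_distance`, and `fit_scale` are hypothetical. In practice the latent vectors would be extracted from the GNN; here they are plain tuples of floats.

```python
import math


def nearest_distance(query, reference):
    """Euclidean distance from `query` to its nearest vector in `reference`."""
    return min(math.dist(tuple(query), ref) for ref in reference)


def fit_scale(distances, errors):
    """Scale s such that the RMS of s * distance equals the RMS error.

    Assumes at least one calibration distance is non-zero.
    """
    return math.sqrt(sum(e * e for e in errors) / sum(d * d for d in distances))


class LatentDistanceUQ:
    """Hypothetical latent-distance uncertainty estimator."""

    def __init__(self, train_latents):
        # Setup, step 1: store latents sampled from the training set.
        self.train_latents = [tuple(v) for v in train_latents]
        self.scale = 1.0

    def raw(self, latent):
        # Uncalibrated uncertainty: distance to the nearest training latent.
        return nearest_distance(latent, self.train_latents)

    def calibrate(self, calib_latents, calib_errors):
        # Setup, step 2: run inference on the calibration set, then fit
        # the mapping from raw distance to observed error magnitude.
        distances = [self.raw(v) for v in calib_latents]
        self.scale = fit_scale(distances, calib_errors)

    def predict(self, latent):
        # Inference: calibrated uncertainty for a new latent vector.
        return self.scale * self.raw(latent)


uq = LatentDistanceUQ([(0.0, 0.0), (1.0, 0.0)])
uq.calibrate([(0.0, 1.0), (1.0, 2.0)], [0.1, 0.2])
far = uq.predict((0.0, 3.0))   # farther from the training latents
near = uq.predict((0.0, 1.0))  # closer to the training latents
```

As the caption notes, the calibration step is independent of how the raw uncertainty is produced, so the same `fit_scale` step could in principle be applied to ensemble or mean-variance outputs as well.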
