Uncertainty Quantification in Graph Neural Networks with Shallow Ensembles
Tirtha Vinchurkar, Kareem Abdelmaqsoud, John R. Kitchin
TL;DR
This work addresses the challenge of unreliable GNN predictions for out-of-domain data in materials modeling by introducing Direct Propagation of Shallow Ensembles (DPOSE) integrated with SchNet. DPOSE uses lightweight, weight-sharing shallow ensembles, trained with a Negative Log-Likelihood (NLL) objective, to deliver both predictive means and uncertainties without the heavy cost of deep ensembles. Evaluations on the QM9, OC20, and Gold MD datasets show that DPOSE can distinguish in-domain from out-of-domain configurations and captures uncertainty trends related to molecular size, distortions, and compositional changes, with dataset-specific strengths and limitations. The approach offers a scalable path toward robust, uncertainty-aware materials discovery and can be integrated with active learning to guide efficient exploration of design spaces.
Abstract
Machine-learned potentials (MLPs) have revolutionized materials discovery by providing accurate and efficient predictions of molecular and material properties. Graph Neural Networks (GNNs) have emerged as a state-of-the-art approach due to their ability to capture complex atomic interactions. However, GNNs often produce unreliable predictions when encountering out-of-domain data, and it is difficult to identify when that happens. To address this challenge, we explore Uncertainty Quantification (UQ) techniques, focusing on Direct Propagation of Shallow Ensembles (DPOSE) as a computationally efficient alternative to deep ensembles. By integrating DPOSE into the SchNet model, we assess its ability to provide reliable uncertainty estimates across diverse Density Functional Theory datasets, including QM9, OC20, and Gold Molecular Dynamics. Our findings show that DPOSE often distinguishes between in-domain and out-of-domain samples, exhibiting higher uncertainty for unobserved molecule and material classes. This work highlights the potential of lightweight UQ methods in improving the robustness of GNN-based materials modeling and lays the foundation for future integration with active learning strategies.
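
To make the shallow-ensemble idea concrete, the following is a minimal sketch of how several ensemble outputs for one sample could be collapsed into a predictive mean and variance and scored with a Gaussian negative log-likelihood. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of the Bessel-corrected sample variance, and the `min_variance` floor are all hypothetical choices, and the member predictions are assumed to come from output heads that share one network body.

```python
import math
from statistics import fmean, variance


def ensemble_mean_and_variance(member_predictions, min_variance=1e-8):
    """Collapse the K member outputs for one sample into (mean, variance).

    Assumption: the K values come from K output heads of a shared network.
    The variance floor is a hypothetical safeguard that keeps the log finite.
    """
    mu = fmean(member_predictions)
    # Bessel-corrected sample variance across the ensemble members.
    var = max(variance(member_predictions, mu), min_variance)
    return mu, var


def gaussian_nll(target, mu, var):
    """Per-sample Gaussian NLL: 0.5 * (log(2*pi*var) + (target - mu)^2 / var)."""
    return 0.5 * (math.log(2.0 * math.pi * var) + (target - mu) ** 2 / var)


def mean_nll(batch_predictions, targets):
    """Average the Gaussian NLL over a batch of samples."""
    losses = []
    for member_predictions, target in zip(batch_predictions, targets):
        mu, var = ensemble_mean_and_variance(member_predictions)
        losses.append(gaussian_nll(target, mu, var))
    return fmean(losses)


if __name__ == "__main__":
    # Two samples, three ensemble members each (made-up numbers).
    batch = [[1.0, 2.0, 3.0], [0.9, 1.0, 1.1]]
    targets = [2.0, 1.0]
    print(mean_nll(batch, targets))
```

Under this objective, an ensemble that is confidently wrong (small spread, large error) is penalized far more than one that is wrong but reports a wide spread, which is the property that makes the ensemble variance usable as an uncertainty estimate.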
