BLIPs: Bayesian Learned Interatomic Potentials

Dario Coscia, Pim de Haan, Max Welling

TL;DR

BLIP addresses the scarcity of principled uncertainty quantification in MLIPs by introducing a scalable variational Bayesian framework that injects input-dependent stochasticity into MPNN-based interatomic potentials. Through an adaptive dropout scheme governed by a lightweight inference network, BLIP yields well-calibrated uncertainty estimates while maintaining inference efficiency comparable to deterministic models. Empirically, BLIP improves predictive accuracy and uncertainty calibration across data-scarce, out-of-distribution, and large-scale fine-tuning tasks, and enhances active learning effectiveness when selecting informative structures. The approach is architecture-agnostic, integrates with equivariant/invariant MLIPs, and offers a practical drop-in tool for uncertainty-aware atomistic simulations and materials discovery.

Abstract

Machine Learning Interatomic Potentials (MLIPs) are becoming a central tool in simulation-based chemistry. However, like most deep learning models, MLIPs struggle to make accurate predictions on out-of-distribution data or when trained in a data-scarce regime, both common scenarios in simulation-based chemistry. Moreover, MLIPs do not provide uncertainty estimates by construction, which are fundamental to guide active learning pipelines and to ensure the accuracy of simulation results compared to quantum calculations. To address this shortcoming, we propose BLIPs: Bayesian Learned Interatomic Potentials. BLIP is a scalable, architecture-agnostic variational Bayesian framework for training or fine-tuning MLIPs, built on an adaptive version of Variational Dropout. BLIP delivers well-calibrated uncertainty estimates and minimal computational overhead for energy and forces prediction at inference time, while integrating seamlessly with (equivariant) message-passing architectures. Empirical results on simulation-based computational chemistry tasks demonstrate improved predictive accuracy with respect to standard MLIPs, and trustworthy uncertainty estimates, especially in data-scarce or heavy out-of-distribution regimes. Moreover, fine-tuning pretrained MLIPs with BLIP yields consistent performance gains and calibrated uncertainties.

Paper Structure

This paper contains 57 sections, 2 theorems, 37 equations, 4 figures, 6 tables, 4 algorithms.

Key Result

Proposition 1

The stochastic map $S$ is $G$-equivariant: $S(g\cdot x) = g_*\,S(x)$ for all $g\in G$ and all inputs $x$, where $g_*:\mathcal{P}(Y)\to\mathcal{P}(Y)$ is the pushforward by the action $y\mapsto g\cdot y$.
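
As a hedged illustration of what this statement implies, suppose (assumptions not stated in this summary) that equivariance reads $S(g\cdot x)=g_*S(x)$ and that $G$ acts on $Y$ linearly through a representation $\rho$, as rotations do on force vectors. The predictive mean and covariance then transform as:

$$
\mathbb{E}_{y'\sim S(g\cdot x)}[y'] = \rho(g)\,\mathbb{E}_{y\sim S(x)}[y],
\qquad
\mathrm{Cov}_{y'\sim S(g\cdot x)}[y'] = \rho(g)\,\mathrm{Cov}_{y\sim S(x)}[y]\,\rho(g)^{\top}.
$$

For an invariant output such as the energy, $\rho(g)=1$, so both the mean and the variance are unchanged when the input structure is transformed by $g$.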

Figures (4)

  • Figure 1: Bayesian Learned Interatomic Potentials (BLIPs) for Prediction and Uncertainty Quantification. An atomic structure is encoded as a graph, with initial node features $\bm{h}^0$ and edge features $a_{ij}$, and processed by a standard Message Passing Neural Network (MPNN). BLIP introduces stochasticity into the machine learned interatomic potential (MLIP) by injecting zero-mean Gaussian noise into the message and update functions. Specifically, the deterministic weights $\bm{\theta}^M_l$ and $\bm{\theta}^U_l$ are perturbed using perturbation scales $\alpha_l$ and $\beta_l$ (obtained through an inference network), respectively, yielding stochastic weights $\bm{\omega}^M_l$ and $\bm{\omega}^U_l$. The main weights $\bm{\theta}^M_l$ and $\bm{\theta}^U_l$, and the inference network weights are trained jointly using variational inference. This transforms MLIP into BLIP, a probabilistic model, enabling principled uncertainty quantification and improved predictive accuracy.
  • Figure 2: Expected calibration error for Forces in the Ammonia system for OOD data. Averages are computed over 4 random initialisation seeds. Baseline models' results are from Tan et al. (2023).
  • Figure 3: (a) Visualisation of the silica glass structure. (b) Forces mean absolute error (MAE) in the Silica system across different training sizes for fine-tuned ORB v3 direct 20OMAT. Averages are computed over 4 random initialisation seeds, and the standard error of the mean is reported as an error bar. The best baseline model is a Deep Ensemble Equivariant PaiNN from Tan et al. (2023) trained on the $1024$-configuration set (same split), achieving MAE $6.58 \text{ kcal/mol/Å}$.
  • Figure 4: Validation loss for different models during each active learning step, calculated as the sum of mean absolute errors (MAE) for energy and forces. Results are averaged over 3 independent training runs. Energy is reported in eV and forces in eV/Å.
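
The noise-injection mechanism described in the Figure 1 caption can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the authors' implementation: the multiplicative form $\omega = \theta\,(1 + \alpha\,\epsilon)$ is borrowed from Gaussian dropout, a single dense layer stands in for a message-passing layer, and the names `inference_network`, `stochastic_layer`, `predict_with_uncertainty`, `phi`, and `w_out` are hypothetical.

```python
import numpy as np

def inference_network(h, phi):
    """Hypothetical inference network: pooled node features -> positive noise scale."""
    pooled = h.mean(axis=0)                # permutation-invariant summary of the input
    return np.log1p(np.exp(pooled @ phi))  # softplus keeps the scale positive

def stochastic_layer(h, theta, alpha, rng):
    """Perturb theta with zero-mean Gaussian noise scaled by alpha, then apply it."""
    eps = rng.standard_normal(theta.shape)
    omega = theta * (1.0 + alpha * eps)    # assumed multiplicative (Gaussian-dropout) form
    return np.tanh(h @ omega)

def predict_with_uncertainty(h, theta, phi, w_out, n_samples=64, seed=0):
    """Monte Carlo estimate of the predictive mean and standard deviation of the energy."""
    rng = np.random.default_rng(seed)
    alpha = inference_network(h, phi)      # input-dependent scalar scale
    energies = np.array([
        (stochastic_layer(h, theta, alpha, rng) @ w_out).sum()  # sum of per-atom energies
        for _ in range(n_samples)
    ])
    return energies.mean(), energies.std()

# Toy inputs: 5 atoms with 8 features each (random stand-ins, not real data).
data_rng = np.random.default_rng(1)
h = data_rng.standard_normal((5, 8))
theta = 0.3 * data_rng.standard_normal((8, 8))
phi = 0.1 * data_rng.standard_normal(8)
w_out = data_rng.standard_normal(8)

mean_e, std_e = predict_with_uncertainty(h, theta, phi, w_out)
print(f"energy mean = {mean_e:.3f}, std = {std_e:.3f}")
```

Setting `alpha` to zero recovers the deterministic layer, which is why a perturbation of this kind can be attached to an existing set of weights; the spread across samples serves as the uncertainty estimate.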

Theorems & Definitions (3)

  • Proposition 1
  • proof
  • Corollary 1