Outlier-Detection for Reactive Machine Learned Potential Energy Surfaces

Luis Itza Vazquez-Salazar, Silvan Käser, Markus Meuwly

TL;DR

A structure-based indicator was found to be correlated with large average error, which may help to rapidly classify new structures into those that provide an advantage for refining the neural network.

Abstract

Uncertainty quantification (UQ) to detect samples with large expected errors (outliers) is applied to reactive molecular potential energy surfaces (PESs). Three methods - Ensembles, Deep Evidential Regression (DER), and Gaussian Mixture Models (GMM) - were applied to the H-transfer reaction between ${\it syn-}$Criegee and vinyl hydroxyperoxide. The results indicate that ensemble models detect outliers best, followed by GMM. For example, from a pool of the 1000 structures with the largest uncertainty, the detection quality for outliers is $\sim 90\,\%$ and $\sim 50\,\%$, respectively, if 25 or 1000 structures with large errors are sought. In contrast, the limitations of the statistical assumptions of DER greatly impaired its predictive capabilities. Finally, a structure-based indicator was found to be correlated with large average error, which may help to rapidly classify new structures into those that provide an advantage for refining the neural network.
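The detection quality quoted in the abstract can be illustrated with a short sketch. It assumes one plausible reading of the metric: the fraction of the `n_error` structures with the largest absolute error that also appear among the `n_pool` structures with the largest predicted uncertainty. The function name `detection_quality` and its parameters are hypothetical and not taken from the paper, whose exact definition may differ.

```python
import numpy as np


def detection_quality(errors, uncertainties, n_error, n_pool):
    """Fraction of the n_error largest-error samples that fall inside
    the n_pool most uncertain samples (assumed definition, see text)."""
    errors = np.abs(np.asarray(errors, dtype=float))
    uncertainties = np.asarray(uncertainties, dtype=float)
    if errors.ndim != 1 or errors.shape != uncertainties.shape:
        raise ValueError("errors and uncertainties must be 1-D and equally long")
    if not (0 < n_error <= errors.size and 0 < n_pool <= errors.size):
        raise ValueError("n_error and n_pool must lie in [1, number of samples]")

    # Indices of the samples with the largest absolute errors.
    top_error = np.argsort(errors)[-n_error:]
    # Indices of the samples with the largest predicted uncertainty.
    top_uncertain = np.argsort(uncertainties)[-n_pool:]

    # Count how many true outliers were flagged by the uncertainty ranking.
    hits = np.intersect1d(top_error, top_uncertain).size
    return hits / n_error


# Toy data (made up for illustration, not results from the paper).
toy_errors = [0.1, 5.0, 0.2, 3.0, 0.05, 4.0]
toy_uncertainties = [0.9, 0.8, 0.1, 0.05, 0.2, 0.7]
quality_2_of_3 = detection_quality(toy_errors, toy_uncertainties, n_error=2, n_pool=3)
quality_3_of_3 = detection_quality(toy_errors, toy_uncertainties, n_error=3, n_pool=3)
```

With this reading, the setting described in the abstract corresponds to `n_pool=1000` and `n_error=25` or `n_error=1000`, with the uncertainty taken as the ensemble or DER variance, or as the negative log-likelihood for GMM.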

Paper Structure (30 sections, 23 equations, 31 figures, 6 tables)

Figures (31)

  • Figure 1: Characteristics of the stationary points of the PESs. The energy of the VHP minimum serves as a reference. The energy scale is exaggerated to better represent the differences between the methods.
  • Figure 2: Behaviour of the different models during simulation. Panel A shows the minimum energy path (MEP) from syn-Criegee to VHP for the different UQ methods used in this work. The zero of energy is the corresponding value for the optimized structure of VHP. Panel B shows the energy distribution for the different models during the simulation. Note that the $x$-axis is on a logarithmic scale. Starting from syn-Criegee, the system was simulated for 500 ps with a time step of 0.1 fs. The inset shows the time series of the energy for DER-M. Panel C shows the variation of the energy along the minimum dynamic path (MDP) of the different formulations of the ML-PESs starting from the optimized TS. Panel D reports the time series of the reaction coordinate ($q=r_{\rm CH}-r_{\rm OH}$) from the MDP.
  • Figure 3: Performance of Ens-3 and Ens-6 on the test set. Panels A and B on the left show residual plots of the error between reference and prediction. The 1000 energies with the largest variance are shaded with a different colour and directly reflect the model's capability to detect outliers. The corresponding colour bar represents the scale of the variance. Squared error distributions (solid lines) and variance distributions (dotted lines) are shown in the centre next to panels A and B for comparison. Complementary to this is the variance distribution shown on the right of both panels. Note that the $x$-axes of the centre and right panels are on a logarithmic scale.
  • Figure 4: Performance of the different versions of PhysNet-DER through the range of energies of the test set. Panels A to C on the left show residual plots of the error between reference and inference for DER-S, DER-L, and DER-M, respectively. The 1000 points with the largest variance are shaded with a different colour (red, magenta, and yellow from top to bottom) and directly reflect the model's capability to detect outliers. The corresponding colour bar represents the scale of the values. Squared error distributions (solid lines) and variance distributions (dotted lines) are shown in the centre next to panels A, B, and C for comparison. Complementary to this is the variance distribution shown on the right of each panel. Note that the $x$-axes of the centre and right panels are on a logarithmic scale.
  • Figure 5: Performance of PhysNet-GMM through the range of energies of the test set. A residual plot of the error between reference and prediction is shown on the left. The 1000 points with the largest negative log-likelihood (NLL) value are shaded with a different colour and directly reflect the model's capability to detect outliers. The corresponding colour bar represents the scale of the values. The panel in the centre shows the squared error distribution. Note that the $x$-axis of the centre panel is on a logarithmic scale for clarity. The panel on the right displays the distribution of the NLL, which is used to quantify the uncertainty.
  • ...and 26 more figures