Probabilistic Neural Networks (PNNs) for Modeling Aleatoric Uncertainty in Scientific Machine Learning

Farhad Pourkamali-Anaraki, Jamal F. Husseini, Scott E. Stapleton

TL;DR

The work addresses modeling aleatoric uncertainty in scientific machine learning by replacing the deterministic output with a trainable Gaussian distribution characterized by a mean $f_{\mu}$ and variance $f_{\sigma}$, optimized via the negative log-likelihood. It introduces KL-divergence as a principled metric for neural-architecture search, benchmarking PNNs against Gaussian process regression and demonstrating superior handling of heteroscedastic data in both synthetic tests and a real materials-science case. Key findings show that PNNs can achieve mean predictions with $R^2$ around $0.97$ and predictive intervals with high observed correlation (≈ 0.80), while GPR struggles with heteroscedastic uncertainty in these settings. The approach provides robust, distribution-aware surrogate modeling for scientific problems and suggests avenues for integrating epistemic uncertainty and active learning in future work.
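
To make the training objective concrete, the following is a minimal NumPy sketch of the Gaussian negative log-likelihood that a network with a predicted mean and a predicted variance would minimize. The function name `gaussian_nll`, the `eps` floor on the variance, and the toy sine data are assumptions introduced for illustration; they are not taken from the paper.

```python
import numpy as np

def gaussian_nll(y, mean, var, eps=1e-6):
    """Average negative log-likelihood of y under N(mean, var), elementwise."""
    y, mean, var = (np.asarray(a, dtype=float) for a in (y, mean, var))
    var = np.maximum(var, eps)  # assumed guard against non-positive variances
    return float(np.mean(0.5 * (np.log(2.0 * np.pi * var) + (y - mean) ** 2 / var)))

# Hypothetical heteroscedastic toy data: the noise scale grows with x.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 2000)
true_mean = np.sin(2.0 * np.pi * x)
true_std = 0.05 + 0.5 * x
y = true_mean + true_std * rng.normal(size=x.size)

# Same mean in both cases; only the variance model differs.
nll_hetero = gaussian_nll(y, true_mean, true_std ** 2)
nll_homo = gaussian_nll(y, true_mean, np.full_like(x, np.mean(true_std ** 2)))
print(nll_hetero, nll_homo)
```

On this toy data the input-dependent variance yields the lower loss, which is the property the objective rewards when the noise is heteroscedastic.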

Abstract

This paper investigates the use of probabilistic neural networks (PNNs) to model aleatoric uncertainty, which refers to the inherent variability in the input-output relationships of a system, often characterized by unequal variance or heteroscedasticity. Unlike traditional neural networks that produce deterministic outputs, PNNs generate probability distributions for the target variable, allowing the determination of both predicted means and intervals in regression scenarios. Contributions of this paper include the development of a probabilistic distance metric to optimize PNN architecture, and the deployment of PNNs in controlled data sets as well as a practical material science case involving fiber-reinforced composites. The findings confirm that PNNs effectively model aleatoric uncertainty, proving to be more appropriate than the commonly employed Gaussian process regression for this purpose. Specifically, in a real-world scientific machine learning context, PNNs yield remarkably accurate output mean estimates with R-squared scores approaching 0.97, and their predicted intervals exhibit a high correlation coefficient of nearly 0.80, closely matching observed data intervals. Hence, this research contributes to the ongoing exploration of leveraging the sophisticated representational capacity of neural networks to delineate complex input-output relationships in scientific problems.
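
The two reported scores can be illustrated with a short sketch: an R-squared score between empirical and predicted means, and a Pearson correlation coefficient between empirical and predicted interval widths. The helper names below, and the choice of a symmetric Gaussian interval with `z=1.96`, are assumptions made for illustration; the interval level used in the paper is not stated in this summary.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination; 1.0 is a perfect match."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

def interval_width(var, z=1.96):
    """Width of a symmetric Gaussian interval mean +/- z * std (z is assumed)."""
    return 2.0 * z * np.sqrt(np.asarray(var, dtype=float))
```

Both scores have a maximum of 1, so values near 0.97 for the means and near 0.80 for the intervals would indicate a close match for the former and a weaker but still substantial agreement for the latter.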

Paper Structure

This paper contains 10 sections, 15 equations, 9 figures.

Figures (9)

  • Figure 1: The inclusion of the probabilistic output layer transforms the neural network from making deterministic predictions to providing a normal distribution characterized by its mean $f_{\mu}(\mathbf{x};\boldsymbol{\theta})$ and variance $f_{\sigma}(\mathbf{x};\boldsymbol{\theta})$. By providing a distribution of possible outcomes rather than a single value, PNNs enable the quantification of heteroscedastic aleatoric uncertainty.
  • Figure 2: Demonstrating the critical role of hyperparameter tuning in PNNs through the application of the proposed KL divergence metric. This controlled case study explores $4$ varying depths for the number of hidden layers, ranging from $1$ to $4$, alongside $4$ distinct widths, represented by the number of hidden units within the set $\{2, 4, 6, 8\}$. The outcomes of the grid search are showcased in (a), while (b) and (c) respectively highlight the least effective and optimal PNN models identified through this process. Note that in (a), lower KL divergence values are favored as they indicate a closer match between the actual and predicted output distributions.
  • Figure 3: Comparing the predicted mean with the actual or empirical mean in (a), alongside the predicted intervals against the actual intervals in (b) using the optimized PNN on the synthetic data set. The PNN delivers highly accurate predictions of the empirical mean while generating intervals that are marginally wider than those observed empirically. Note that for both metrics, the highest possible score is $1$.
  • Figure 4: Using GPR to model the heteroscedastic aleatoric noise in the synthetic data set, we display the KL divergence scores for different settings of the two hyperparameters in (a). Additionally, we evaluate the top-performing GPR model's accuracy in predicting mean and variance, as shown in (b) and (c). The results clearly indicate that GPR struggles to adequately represent the aleatoric uncertainty inherent in this controlled data setting.
  • Figure 5: Utilizing PNNs to model the aleatoric uncertainty associated with the Ishigami function, we report the KL divergence values between the actual and estimated output distributions in (a). Additionally, we examine the predicted mean and interval accuracies in (b) and (c). The results indicate that the optimized PNN is effective in generating an accurate predictive distribution for the output.
  • ...and 4 more figures
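
The architecture search described in the Figure 2 caption can be sketched as follows, under the assumption that the actual and predicted output distributions at each input are both treated as univariate Gaussians, so that the KL divergence has a closed form. The direction KL(actual || predicted), the averaging over inputs, and the helper names `kl_gaussian`, `mean_kl`, and `select_architecture` are illustrative assumptions rather than the paper's stated procedure.

```python
from itertools import product
import numpy as np

def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(N(mu_p, var_p) || N(mu_q, var_q)), elementwise."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def mean_kl(emp_mean, emp_var, pred_mean, pred_var):
    """Average KL over inputs; lower means a closer distributional match."""
    return float(np.mean(kl_gaussian(emp_mean, emp_var, pred_mean, pred_var)))

# Grid from the Figure 2 caption: 1 to 4 hidden layers, widths in {2, 4, 6, 8}.
depths = (1, 2, 3, 4)
widths = (2, 4, 6, 8)
grid = list(product(depths, widths))  # 16 candidate (depth, width) pairs

def select_architecture(scores):
    """scores maps (depth, width) to a mean KL value; return the minimizer."""
    return min(scores, key=scores.get)
```

In use, each `(depth, width)` pair in `grid` would be trained, scored with `mean_kl` on held-out inputs, and the pair with the lowest score kept.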