Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial

Venkat Nemani, Luca Biggio, Xun Huan, Zhen Hu, Olga Fink, Anh Tran, Yan Wang, Xiaoge Zhang, Chao Hu

TL;DR

This tutorial provides a comprehensive, practitioner-friendly framework for uncertainty quantification (UQ) in neural-network–driven engineering design and prognostics. It classifies uncertainty into aleatory and epistemic types, demonstrates how to decompose predictive uncertainty, and compares major UQ methods (GPR, Bayesian neural networks, ensembles, and deterministic approaches like SNGP) through toy and real-case studies. The work emphasizes calibration, OOD detection, and evaluation metrics (calibration curves, NLL, sparsification) and demonstrates two GitHub-backed case studies on battery RUL and turbofan engine prognostics to benchmark methods. Overall, it argues for principled, scalable UQ as essential for safety-critical deployment of ML in engineering, while outlining future directions for testbeds, decomposition, and theory-informed, efficiency-focused UQ methods.

Abstract

Layered on top of machine learning (ML) models, uncertainty quantification (UQ) serves as an essential layer of safety assurance that could lead to more principled decision making by enabling sound risk assessment and management. The safety and reliability improvements that UQ brings to ML models have the potential to significantly facilitate the broad adoption of ML solutions in high-stakes decision settings such as healthcare, manufacturing, and aviation. In this tutorial, we aim to provide a holistic lens on emerging UQ methods for ML models, with a particular focus on neural networks and on the application of these UQ methods to engineering design as well as prognostics and health management problems. Toward this goal, we start with a comprehensive classification of uncertainty types, sources, and causes pertaining to UQ of ML models. Next, we provide a tutorial-style description of several state-of-the-art UQ methods: Gaussian process regression, Bayesian neural networks, neural network ensembles, and deterministic UQ methods, focusing on the spectral-normalized neural Gaussian process. Building on the mathematical formulations, we then examine the soundness of these UQ methods quantitatively and qualitatively (using a toy regression example) to assess their strengths and shortcomings along different dimensions. Then, we review quantitative metrics commonly used to assess the quality of predictive uncertainty in classification and regression problems. Afterward, we discuss the increasingly important role of UQ of ML models in solving challenging problems in engineering design and health prognostics. Two case studies, with source code available on GitHub, are used to demonstrate these UQ methods and compare their performance in the early-stage life prediction of lithium-ion batteries and the remaining useful life prediction of turbofan engines.

Paper Structure

This paper contains 76 sections, 47 equations, 28 figures, 8 tables.

Figures (28)

  • Figure 1: Overview of the organization of the tutorial paper.
  • Figure 2: An example of uncertainty decomposition using variance-decomposition based method.
  • Figure 3: Types of uncertainty sources in ML models and the process of reducing epistemic uncertainty (i.e., methods (b).i and (a).i described in Sec. \ref{sec:reduction}).
  • Figure 4: Graphical comparison of six state-of-the-art UQ methods introduced in Sec. \ref{sec:UQ_methods}. These methods are GPR (method 1), BNN via MCMC or VI (method 2), BNN via MC dropout (method 3), neural network ensemble (method 4), DNN with GPR, abbreviated DNN-GPR (method 5), and SNGP (method 6). In method 1, MVN stands for the multivariate normal distribution, which is equivalent to the multivariate Gaussian distribution used in the main text. In methods 5 and 6, SN stands for spectral normalization.
  • Figure 5: Sample functions drawn from a Gaussian process prior (a) and posterior (b). The GPR model uses the squared exponential kernel with a length scale ($l$) of 1 and a signal amplitude ($\sigma_\mathrm{f}$) of 1, and a Gaussian observation model with a noise standard deviation ($\sigma_{\varepsilon}$) of 0.1. The means are shown as a solid blue curve, and the $\sim$95% confidence intervals (means plus and minus two standard deviations) are shown as a light blue shaded area. The 20 training observations are generated by corrupting a sine function with white Gaussian noise, $y = \sin(0.9x) + \varepsilon$ with $\varepsilon \sim \mathcal{N} \left(0, 0.1^2 \right)$; these observations are shown as red dots.
  • ...and 23 more figures
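
The GPR setup described in the Figure 5 caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the kernel hyperparameters ($l = 1$, $\sigma_\mathrm{f} = 1$, $\sigma_{\varepsilon} = 0.1$), the 20 observations, and the generating function $y = \sin(0.9x) + \varepsilon$ come from the caption, while the training input range, the test grid, the random seed, and the helper names `sq_exp_kernel` and `gp_posterior` are assumptions made here for illustration.

```python
import numpy as np


def sq_exp_kernel(xa, xb, length_scale=1.0, signal_amp=1.0):
    """Squared exponential kernel: sigma_f^2 * exp(-(x - x')^2 / (2 l^2))."""
    diff = xa[:, None] - xb[None, :]
    return signal_amp**2 * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise_std=0.1,
                 length_scale=1.0, signal_amp=1.0):
    """Posterior mean and standard deviation of the latent function at x_test."""
    k_tt = sq_exp_kernel(x_train, x_train, length_scale, signal_amp)
    k_ts = sq_exp_kernel(x_train, x_test, length_scale, signal_amp)
    k_ss = sq_exp_kernel(x_test, x_test, length_scale, signal_amp)

    # Add the observation noise variance to the training covariance.
    k_noisy = k_tt + noise_std**2 * np.eye(len(x_train))

    # Cholesky-based solves avoid forming an explicit matrix inverse.
    chol = np.linalg.cholesky(k_noisy)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha

    v = np.linalg.solve(chol, k_ts)
    cov = k_ss - v.T @ v
    # Clip tiny negative values caused by floating-point round-off.
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


rng = np.random.default_rng(0)  # seed chosen arbitrarily (assumption)

# 20 noisy observations of sin(0.9 x); the input range [-5, 5] is an assumption.
x_train = rng.uniform(-5.0, 5.0, size=20)
y_train = np.sin(0.9 * x_train) + rng.normal(0.0, 0.1, size=20)

# Test grid extending beyond the training range (assumption).
x_test = np.linspace(-8.0, 8.0, 200)
mean, std = gp_posterior(x_train, y_train, x_test)

# Approximate 95% band: mean plus and minus two standard deviations.
lower, upper = mean - 2.0 * std, mean + 2.0 * std
```

Under these assumptions, `std` is small near the training points and grows back toward the prior value $\sigma_\mathrm{f} = 1$ as the test inputs move away from the data, which is the qualitative behavior the shaded band in such a figure conveys. The band here describes the latent function; adding $\sigma_{\varepsilon}^2$ to the diagonal of `cov` would give the band for noisy observations instead.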