
Variational Bayesian surrogate modelling with application to robust design optimisation

Thomas A. Archbold, Ieva Kazlauskaite, Fehmi Cirak

TL;DR

The paper demonstrates the accuracy and versatility of the proposed reduced dimension variational Gaussian process (RDVGP) surrogate on illustrative and robust structural optimisation problems whose cost functions depend on a weighted sum of the mean and standard deviation of the model outputs.

Abstract

Surrogate models provide a quick-to-evaluate approximation to complex computational models and are essential for multi-query problems like design optimisation. The inputs of current deterministic computational models are usually high-dimensional and uncertain. We consider Bayesian inference for constructing statistical surrogates with input uncertainties and intrinsic dimensionality reduction. The surrogate is trained by fitting to data obtained from a deterministic computational model. The assumed prior probability density of the surrogate is a Gaussian process. We determine the respective posterior probability density and parameters of the posited statistical model using variational Bayes. The non-Gaussian posterior is approximated by a Gaussian trial density with free variational parameters and the discrepancy between them is measured using the Kullback-Leibler (KL) divergence. We employ the stochastic gradient method to compute the variational parameters and other statistical model parameters by minimising the KL divergence. We demonstrate the accuracy and versatility of the proposed reduced dimension variational Gaussian process (RDVGP) surrogate on illustrative and robust structural optimisation problems where cost functions depend on a weighted sum of the mean and standard deviation of model outputs.
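The robust cost described above, a weighted sum of the mean and standard deviation of the model output under uncertain inputs, can be sketched with plain Monte Carlo sampling. This is a minimal illustration rather than the authors' method: it evaluates the one-dimensional test function quoted in the Figure 1 caption directly instead of an RDVGP surrogate, and the weight `alpha`, the constant input standard deviation `s_std = 0.05`, the form `mean + alpha * std`, and the helper name `robust_cost` are all assumptions made for the example.

```python
import numpy as np


def f(s):
    """Illustrative test function quoted in the Figure 1 caption."""
    return -0.5 * s * np.sin(3.0 * np.pi * s**2) + 0.25 * s


def robust_cost(model, s_mean, s_std, alpha=1.0, n_samples=20_000, seed=0):
    """Monte Carlo estimate of mean + alpha * std of model(s), s ~ N(s_mean, s_std^2).

    Hypothetical helper: `alpha` and the constant `s_std` are assumptions,
    not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    samples = rng.normal(s_mean, s_std, size=n_samples)  # uncertain input
    outputs = model(samples)                              # propagate through the model
    return outputs.mean() + alpha * outputs.std(ddof=1)


# Scan candidate mean inputs and pick the one with the lowest robust cost.
candidates = np.linspace(0.0, 1.0, 21)
costs = np.array([robust_cost(f, s, s_std=0.05) for s in candidates])
best = candidates[np.argmin(costs)]
print(f"lowest robust cost at mean input s = {best:.2f}")
```

In a multi-query setting such as design optimisation, every candidate requires many model evaluations, which is why a quick-to-evaluate surrogate would take the place of `f` in a sketch like this.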

Paper Structure (16 sections, 59 equations, 14 figures, 3 tables, 1 algorithm)


Figures (14)

  • Figure 1: Comparison of the standard GP and the proposed RDVGP surrogates for a function $f(s)= -0.5s\sin{(3\pi s^2)} + 0.25s$ and a normally distributed input variable with spatially dependent variance, $p(s) = \mathcal{N}(\bar{s},\sigma_s^2)$; see also Section \ref{subsection:illustrative_example_1}. In the plots, the lines indicate the mean and the shaded areas the $95\%$ confidence intervals. Shown are (a) the true probability density $p(f(\bar{s})) = \int p(f\vert \vec{s})p(\vec{s}) \, \mathrm{d} \vec{s}$ obtained by MC sampling, (b) the $n=5$ training data points comprising the training set $\mathcal{D}$ and their spatially varying standard deviations, (c) the inferred standard GP posterior probability density $p( f(\bar{s}) \vert \mathcal{D})$, and (d) the inferred RDVGP posterior probability density $p( f(\bar{s}) \vert \mathcal{D})$. Comparing (a), (c), and (d), it is evident that the RDVGP posterior is much closer to the true probability density than the standard GP posterior.
  • Figure 2: Graphical model of a standard GP where $n$ is the size of the training data set $\mathcal{D}$, $\vec{s}$ is the vector of (deterministic) input variables, $f$ is the (random) target output variable and $y$ is the noisy observation. The model hyperparameters consist of the prior hyperparameters $\vec{\theta}$ and the noise standard deviation $\sigma_y$.
  • Figure 3: Graphical model of the RDVGP surrogate where $n$ is the size of the training data set $\mathcal{D}$, $\vec{s}$ is the random input variable vector, $\vec{z}$ is the low-dimensional unobserved latent variable vector, $f$ is the target output variable and $y$ is the noisy observation. The model hyperparameters consist of the entries of the orthogonal projection matrix $\vec{W}$, the prior hyperparameters $\vec{\theta}$ and the noise standard deviation $\sigma_y$.
  • Figure 4: Graphical model of the sparse RDVGP surrogate where $n$ is the size of the training data set $\mathcal{D}$, $\vec{s}$ is the random input variable vector, $\vec{z}$ is the low-dimensional unobserved latent variable vector, $f$ is the target output variable, $\tilde{f}$ is the pseudo output variable with $m$ realisations and $y$ is the noisy observation. The model hyperparameters consist of the entries of the orthogonal projection matrix $\vec{W}$, the prior hyperparameters $\vec{\theta}$ (which includes the pseudo latent variable $\tilde{\vec{z}}$) and the noise standard deviation $\sigma_y$.
  • Figure 5: Schematic of slice sampling showing (a) the posterior probability density expected value $\mathbb{E}_{p_\Theta(f_* \vert \vec{y},\vec{s}_*)} (f_*)$ along the slice $\vec{s}_* = (s_{d*}\,\,\bar{s}_f)^{\mathsf{T}}$, and (b) the marginal posterior probability density $p_\Theta(f_* \vert \vec{y})$ over the mean test input variable $\bar{s}_* = (\bar{s}_{d*}\,\,\bar{s}_f)^{\mathsf{T}}$.
  • ...and 9 more figures