Table of Contents
Fetching ...

Algorithms and Scientific Software for Quasi-Monte Carlo, Fast Gaussian Process Regression, and Scientific Machine Learning

Aleksei G. Sorokin

TL;DR

This thesis unifies developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for high-dimensional interpolation with built-in uncertainty quantification, and scientific machine learning (sciML) for modeling partial differential equations (PDEs) with mesh-free solvers.

Abstract

Most scientific domains elicit the development of efficient algorithms and accessible scientific software. This thesis unifies our developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for high-dimensional interpolation with built-in uncertainty quantification, and scientific machine learning (sciML) for modeling partial differential equations (PDEs) with mesh-free solvers. For QMC, we built new algorithms for vectorized error estimation and developed QMCPy (https://qmcsoftware.github.io/QMCSoftware/): an open-source Python interface to randomized low-discrepancy sequence generators, automatic variable transforms, adaptive error estimation procedures, and diverse use cases. For GPs, we derived new digitally-shift-invariant kernels of higher-order smoothness, developed novel fast multitask GP algorithms, and produced the scalable Python software FastGPs (https://alegresor.github.io/fastgps/). For sciML, we developed a new algorithm capable of machine precision recovery of PDEs with random coefficients. We have also studied a number of applications including GPs for probability of failure estimation, multilevel GPs for the Darcy flow equation, neural surrogates for modeling radiative transfer, and fast GPs for Bayesian multilevel QMC.

Algorithms and Scientific Software for Quasi-Monte Carlo, Fast Gaussian Process Regression, and Scientific Machine Learning

TL;DR

This thesis unifies developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for high-dimensional interpolation with built-in uncertainty quantification, and scientific machine learning (sciML) for modeling partial differential equations (PDEs) with mesh-free solvers.

Abstract

Most scientific domains elicit the development of efficient algorithms and accessible scientific software. This thesis unifies our developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for high-dimensional interpolation with built-in uncertainty quantification, and scientific machine learning (sciML) for modeling partial differential equations (PDEs) with mesh-free solvers. For QMC, we built new algorithms for vectorized error estimation and developed QMCPy (https://qmcsoftware.github.io/QMCSoftware/): an open-source Python interface to randomized low-discrepancy sequence generators, automatic variable transforms, adaptive error estimation procedures, and diverse use cases. For GPs, we derived new digitally-shift-invariant kernels of higher-order smoothness, developed novel fast multitask GP algorithms, and produced the scalable Python software FastGPs (https://alegresor.github.io/fastgps/). For sciML, we developed a new algorithm capable of machine precision recovery of PDEs with random coefficients. We have also studied a number of applications including GPs for probability of failure estimation, multilevel GPs for the Darcy flow equation, neural surrogates for modeling radiative transfer, and fast GPs for Bayesian multilevel QMC.

Paper Structure

This paper contains 62 theorems, 403 equations, 47 figures, 13 tables, 7 algorithms.

Key Result

Theorem 4.7.1

Suppose that $h_\varepsilon$ satisfies the metric map condition Then error criterion SoRa_eq:sc_raw holds if and only if Furthermore, the choice of minimizes $\sup_{s \in [s^-,s^+]} \lvert s - \hat{s} \rvert -h_\varepsilon(s)$ for any choice of $s^{\pm}$ with $s^- < s^+$.

Figures (47)

  • Figure 1.1: IID (independent identically distributed) and LD (low-discrepancy) point sets. IID points have gaps and clusters while the dependent LD points fill space more evenly. Quasi-Monte Carlo methods use LD points for high-dimensional integral approximation and typically converge faster than IID Monte Carlo methods. The points are colored as follows: purple stars for the first 32 points, green triangles for the next 32, and blue circles for the subsequent 64. The lattice has been randomly shifted, the digital net has been randomized with a nested uniform scramble, and the Halton points have been randomized with a linear matrix scramble and permutation scramble.
  • Figure 1.2: The relative error of approximating the mean of a $d=6$-dimensional Keister function keister.multidim_quadrature_algorithms_keister_fun for various choices of nodes. Low-discrepancy (LD) nodes for Quasi-Monte Carlo methods achieve superior convergence rates compared to IID Monte Carlo methods and grid-based cubature schemes.
  • Figure 1.3: Sketch of a Gaussian process (GP) model which predicts critical pressure in the subsurface from a controlled extraction rate. The true analytic PDE solution is unknown, and we only have access to noisy observations in the form of numerical PDE solutions. We fit a GP in blue where the posterior mean is the GP approximation and the GP uncertainty is quantified through point-wise confidence intervals derived from the posterior variance.
  • Figure 1.4: Projections of $n=64$ points in $d=3$ dimensions from a regular grid, shifted lattice point set, and digitally-shifted digital net. Notice the lattice and digital net low-discrepancy point sets more evenly fill the unit cube and have more diverse projections.
  • Figure 1.5: Comparison between standard GP regression using grid points with a squared exponential (SE) kernel and fast GP methods pairing either lattices with a shift-invariant (SI) kernel or digital nets with a DSI kernel. We model the Ackley function in $d=2$ dimensions sampled with $n=4096$ observations. Metrics for $L_2$ relative error, time per optimization step, and marginal log-likelihood (MLL) are also given.
  • ...and 42 more figures

Theorems & Definitions (142)

  • Theorem 4.7.1
  • proof
  • Definition 5.1.1: HPD and HPSD kernels
  • Definition 5.1.2: HPD and HPSD matrices
  • Theorem 5.1.3: Moore--Aronszajn
  • Lemma 5.2.2
  • proof
  • Theorem 5.2.4
  • proof
  • Corollary 5.2.5
  • ...and 132 more