Table of Contents
Fetching ...

Linear-Cost Vecchia Approximation of Multivariate Normal Probabilities

Jian Cao, Matthias Katzfuss

TL;DR

This work addresses the computational bottleneck of evaluating high-dimensional MVN probabilities and sampling from TMVN distributions. It introduces VMET, a Vecchia-based reparameterization of the minimax exponential tilting framework, achieving linear-time Monte Carlo sampling and gradient/Hessian estimation while preserving MET's accuracy, even in large-scale problems. Key contributions include a sparse inverse-Cholesky representation via Vecchia, a linear-time solver for the exponential-tilting parameters, a Vecchia-based variable reordering, and optional multi-level MC and regional TMVN sampling for partially censored data. Empirical results on simulations and a 24,701-point groundwater dataset demonstrate substantial speedups, improved accuracy, numerical stability, and practical applicability to partially censored Gaussian-process models; the authors also provide an R package VeccTMVN for public use.

Abstract

Multivariate normal (MVN) probabilities arise in myriad applications, but they are analytically intractable and need to be evaluated via Monte-Carlo-based numerical integration. For the state-of-the-art minimax exponential tilting (MET) method, we show that the complexity of each of its components can be greatly reduced through an integrand parameterization that utilizes the sparse inverse Cholesky factor produced by the Vecchia approximation, whose approximation error is often negligible relative to the Monte-Carlo error. Based on this idea, we derive algorithms that can estimate MVN probabilities and sample from truncated MVN distributions in linear time (and that are easily parallelizable) at the same convergence or acceptance rate as MET, whose complexity is cubic in the dimension of the MVN probability. We showcase the advantages of our methods relative to existing approaches using several simulated examples. We also analyze a groundwater-contamination dataset with over twenty thousand censored measurements to demonstrate the scalability of our method for partially censored Gaussian-process models.

Linear-Cost Vecchia Approximation of Multivariate Normal Probabilities

TL;DR

This work addresses the computational bottleneck of evaluating high-dimensional MVN probabilities and sampling from TMVN distributions. It introduces VMET, a Vecchia-based reparameterization of the minimax exponential tilting framework, achieving linear-time Monte Carlo sampling and gradient/Hessian estimation while preserving MET's accuracy, even in large-scale problems. Key contributions include a sparse inverse-Cholesky representation via Vecchia, a linear-time solver for the exponential-tilting parameters, a Vecchia-based variable reordering, and optional multi-level MC and regional TMVN sampling for partially censored data. Empirical results on simulations and a 24,701-point groundwater dataset demonstrate substantial speedups, improved accuracy, numerical stability, and practical applicability to partially censored Gaussian-process models; the authors also provide an R package VeccTMVN for public use.

Abstract

Multivariate normal (MVN) probabilities arise in myriad applications, but they are analytically intractable and need to be evaluated via Monte-Carlo-based numerical integration. For the state-of-the-art minimax exponential tilting (MET) method, we show that the complexity of each of its components can be greatly reduced through an integrand parameterization that utilizes the sparse inverse Cholesky factor produced by the Vecchia approximation, whose approximation error is often negligible relative to the Monte-Carlo error. Based on this idea, we derive algorithms that can estimate MVN probabilities and sample from truncated MVN distributions in linear time (and that are easily parallelizable) at the same convergence or acceptance rate as MET, whose complexity is cubic in the dimension of the MVN probability. We showcase the advantages of our methods relative to existing approaches using several simulated examples. We also analyze a groundwater-contamination dataset with over twenty thousand censored measurements to demonstrate the scalability of our method for partially censored Gaussian-process models.
Paper Structure (20 sections, 6 theorems, 19 equations, 8 figures, 1 table, 2 algorithms)

This paper contains 20 sections, 6 theorems, 19 equations, 8 figures, 1 table, 2 algorithms.

Key Result

Proposition 1

Suppose $\mathbf{x} \sim \phi_{n}(\mathbf{x}; \bm{\Sigma})$, $\mathbf{L}$ is the lower Cholesky factor of $\bm{\Sigma}$, and $\mathbf{x} = \mathbf{L} \mathbf{y}$. Then

Figures (8)

  • Figure 1: Comparison of FIC-based variable reordering ($m = 30$, $100$, $200$), Vecchia-based variable reordering ($m = 30$), and univariate reordering for the three scenarios described in Table \ref{['tbl:low_dim_exp']}. Each boxplot consists of $30$ estimates using the VMET integrand.
  • Figure 2: For the scenarios in Table \ref{['tbl:low_dim_exp']} with $n=900$, we show the RMSE of 30 probability estimates against the computation time (both on a log scale), for VMET, VCDF, TLR, SOV, and MET. VMET is run with $m \in \{10, 30, 50, 70, 90\}$ and $N = 10^{4}$, SOV and TLR are run with $N \in \{1, 3, 5, 7, 9\} \times 10^{4}$, and MET and VCDF are run with $N \in \{6, 7, 8, 9, 10\} \times 10^{3}$. An MET estimate with $N = 10^{7}$ is used as the truth.
  • Figure 3: For the scenarios in Table \ref{['tbl:low_dim_exp']} with $n=6{,}400$, we show the RMSE of 30 log-probability estimates against the computation time (both on a log scale), for VMET, TLR, and SOV. VMET is run with $m \in \{30, 50, 70\}$ and $N = 10^{5}$, SOV and TLR are run with $N \in \{2, 5, 10\} \times 10^{4}$. For Scenarios 1 and 2, the average of VMET estimates at $m = 70$ is used as truth. For Scenario 3, the average of SOV estimates at $N = 10^{5}$ is used as truth.
  • Figure 4: The RMSE of 30 probability estimates (on a log scale) against computation time under a constant correlation structure. In this simulation, $n = 900$, $\mathbf{a} = -\infty$, $\mathbf{b} = \mathbf{0}$, and $\bm{\Sigma}$ is a correlation matrix with a constant correlation of $0.5$. The left panel compares VMET, VMET with multi-level MC (VMET-ML), MET, TLR, SOV, and VCDF, while the right panel compares only VMET and VMET-ML to highlight their difference in the error-and-computation-cost trade-off. VMET is run with $m \in [10, 90]$ and $N = 10^{4}$, SOV and TLR are run with $N \in \{1, 3, 5, 7, 9\} \times 10^{4}$, and MET and VCDF are run with $N \in \{6, 7, 8, 9, 10\} \times 10^{3}$. The truth in this case is analytically available: $\Phi_n(\mathbf{a}, \mathbf{b}; \bm{\Sigma}) = 1/(n+1)$.
  • Figure 5: Histograms and q-q plot of the TMVN samples generated by our proposed VMET method and the MET method botev2017normal
  • ...and 3 more figures

Theorems & Definitions (6)

  • Proposition 1
  • Proposition 2
  • Proposition 3: informal version
  • Proposition 4
  • Proposition 5
  • Proposition 6