Generative multi-scale modeling and downscaling via spatial autoregressive transport maps

Alejandro Calle-Saldarriaga, Paul F. V. Wiemann, Matthias Katzfuss

Abstract

Spatial fields in the Earth and environmental sciences are often available at multiple scales or resolutions. While coarse-scale data (e.g., from global circulation models) are often abundant, they lack the local detail provided by fine-scale data (e.g., from regional climate models), which are typically computationally expensive to generate. Statistical downscaling and multi-scale data fusion address this challenge by predicting high-resolution fields from low-resolution or related inputs. We propose a highly scalable Bayesian approach that can learn the joint non-Gaussian distribution and nonlinear dependence structure of nonstationary spatial fields across multiple scales from a small number of training samples. Our method employs scale-aware autoregressive Gaussian processes with suitably chosen regularization-inducing priors to model the conditional distribution of fine-scale fields given coarse-scale data. Exploiting conjugacy, the integrated likelihood is available in closed form, enabling efficient parameter optimization via stochastic gradient descent. Once trained, the method provides a closed-form characterization of the posterior distribution of fine-scale fields given coarse-scale inputs. In numerical comparisons, we demonstrate that our approach substantially outperforms existing methods and effectively characterizes and simulates fine-scale climate behavior based on output from coarse global circulation models.

Paper Structure

This paper contains 28 sections, 22 equations, 8 figures.

Figures (8)

  • Figure 1: Our goal is to learn the non-Gaussian joint and conditional distributions of spatial fields at multiple scales from a small number of training samples. Here we show an example of two pairs of low-resolution GCM temperature fields at $N_1 = 336$ coarse pixels over Europe driving high-resolution RCM samples on a fine grid of size $N_2 = 280 \times 280 = 78{,}400$; see Section \ref{sec:climate} for more details. We want to learn the conditional $N_2$-dimensional distribution given an $N_1$-dimensional field from $n \leq 40$ training sample pairs.
  • Figure 2: Conditional maximin ordering and conditioning sets for locations (gray dots) on a low-resolution grid of size $N_1 = 10 \times 10 = 100$ (left and middle panels) and a high-resolution grid (right panel) of size $N_2 = 30 \times 30 = 900$. For (a) $i=35$ and (b) $i=185$, we show the $i$th ordered location ($\color{blue}{+}$), the $i-1$ previously ordered locations ($\circ$), the $m_r=4$ nearest previously ordered locations $c_{r,i}$ at the same scale ($\color{green}{\times}$ in the left and right panels), and the $m_r'=4$ nearest locations $c'_{r,i}$ to $\mathbf{s}_{185}$ in the lower scale $\mathcal{S}_1$ ($\color{green}{\times}$ in the middle panel).
  • Figure 3: For a GP with exponential covariance on coarser-to-finer grids and with the maximin ordering in Figure \ref{fig:maximin}, the map components \ref{eq:mf-tm-component} can be written as $f_{r,i}(\mathbf{y}_{r, <i}, \mathbf{y}_{r-1}) = \sum_{k=1}^{i-1} y_{r,c_{r,i}(k)} b_{r,i,k} + \sum_{k=1}^{N_{r-1}} y_{r-1,c^\prime_{r,i}(k)} b^\prime_{r,i,k}$. For the $i$th location at the $r$th scale, $c_{r,i}(k)$ indicates the $k$th nearest previously ordered neighbor in the same resolution, and $b_{r,i,k}$ the corresponding kriging weight, while $c^\prime_{r,i}(k)$ indicates the $k$th nearest neighbor in the previous resolution, with $b^\prime_{r,i,k}$ as the corresponding kriging weight. (First row): Sample from the process. (Second row): The conditional standard deviations $d_{r,i}$ decay polynomially as a function of $\ell_{r,i}$. (The sudden drop around $i=3{,}200$ is due to grids at different resolutions sharing locations.) (Third row): Squared kriging weights in the same resolution decay rapidly with neighbor number. (Fourth row): Squared kriging weights in the previous resolution also decay rapidly with neighbor number.
  • Figure 4: For downscaling of (linear) block averages, samples from our BTM mimic those from the true data-generating process, and they get closer to test samples when conditioning on more lower-scale test fields. The first two rows show test samples from the block-averaging data-generating process described in Section \ref{sec:coarsening}. The next three rows are samples from our model (trained on $50$ samples). Arrows represent conditioning relationships (e.g., the high-resolution BTM sample in row 3 is drawn from the conditional distribution given the middle-resolution test sample in row 2). More concretely, in column 3, the samples in rows 3 and 4 reproduce the global and local features of row 2 (e.g., high values in the top right), with row 3 matching the "true" sample in row 2 more closely, as it conditions on more information than row 4.
  • Figure 5: Our multi-scale Bayesian transport map (MS BTM) performs best in terms of log-scores for all sample sizes $n$ in two simulation scenarios. (Left): Block-averaging scenario from Section \ref{sec:coarsening}. (Right): (Nonlinear) block minima from Section \ref{sec:min}. The competing methods are listed in Section \ref{sec:metrics}.
  • ...and 3 more figures
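
The ordering, conditioning sets, and kriging weights described in the captions of Figures 2 and 3 can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. All function names, the unit-variance exponential covariance with length scale 0.3, the rule for picking the first ordered point, and the 20 x 20 fine grid (chosen here so that the two grids share no locations) are assumptions made for illustration. The sketch also uses a plain maximin ordering within the fine grid, which simplifies the conditional maximin ordering named in Figure 2, and it uses 0-based indices.

```python
import numpy as np

def grid(n_side):
    """Cell-centre coordinates of an n_side x n_side grid on the unit square."""
    c = (np.arange(n_side) + 0.5) / n_side
    xx, yy = np.meshgrid(c, c)
    return np.column_stack([xx.ravel(), yy.ravel()])

def maximin_order(locs):
    """Greedy maximin ordering: each new point maximizes its minimum
    distance to the points that have already been ordered."""
    n = locs.shape[0]
    # Assumption: start from the point closest to the centroid.
    first = int(np.argmin(np.linalg.norm(locs - locs.mean(axis=0), axis=1)))
    order = [first]
    min_dist = np.linalg.norm(locs - locs[first], axis=1)
    min_dist[first] = -np.inf  # never pick an already-ordered point again
    for _ in range(n - 1):
        nxt = int(np.argmax(min_dist))
        order.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(locs - locs[nxt], axis=1))
        min_dist[nxt] = -np.inf
    return np.array(order)

def nearest(pool, s, m):
    """Indices (into pool) of the m locations closest to s."""
    d = np.linalg.norm(pool - s, axis=1)
    return np.argsort(d)[:m]

def exp_cov(a, b, length_scale=0.3):
    """Unit-variance exponential covariance between two sets of locations."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return np.exp(-d / length_scale)

def kriging(target, cond_locs, length_scale=0.3):
    """Kriging weights and conditional standard deviation of the GP at
    `target` given its values at `cond_locs`."""
    K = exp_cov(cond_locs, cond_locs, length_scale)
    k = exp_cov(cond_locs, target[None, :], length_scale)[:, 0]
    b = np.linalg.solve(K, k)       # weights b = K^{-1} k
    sd = np.sqrt(1.0 - k @ b)       # conditional sd (prior variance is 1)
    return b, sd

coarse = grid(10)                   # N_1 = 100, as in Figure 2
fine = grid(20)                     # hypothetical fine grid, N_2 = 400
fine_ord = fine[maximin_order(fine)]

i, m_same, m_coarse = 84, 4, 4      # illustrative choices
s_i = fine_ord[i]
c_same = nearest(fine_ord[:i], s_i, m_same)    # same-scale, previously ordered
c_coarse = nearest(coarse, s_i, m_coarse)      # nearest coarse-scale locations
cond = np.vstack([fine_ord[c_same], coarse[c_coarse]])
b, sd = kriging(s_i, cond)          # first 4 weights: same scale; last 4: coarse
```

Truncating both sums in the Figure 3 expression to a few nearest neighbors, as done here with `m_same` and `m_coarse`, is motivated by the rapid decay of the squared kriging weights that the caption reports. Conditioning on the coarse-scale neighbors in addition to the same-scale ones can only reduce the conditional standard deviation `sd`.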