Data-Efficient Generative Modeling of Non-Gaussian Global Climate Fields via Scalable Composite Transformations

Johannes Brachem; Paul F. V. Wiemann; Matthias Katzfuss

TL;DR

This work proposes a data-efficient framework for emulating the internal variability of global climate fields, specifically designed to overcome sample-size constraints, and constructs a highly expressive joint distribution via a composite transformation to a multivariate standard normal space.

Abstract

Quantifying uncertainty in future climate projections is hindered by the prohibitive computational cost of running physical climate models, which severely limits the availability of training data. We propose a data-efficient framework for emulating the internal variability of global climate fields, specifically designed to overcome these sample-size constraints. Inspired by copula modeling, our approach constructs a highly expressive joint distribution via a composite transformation to a multivariate standard normal space. We combine a nonparametric Bayesian transport map for spatial dependence modeling with flexible, spatially varying marginal models, essential for capturing non-Gaussian behavior and heavy-tailed extremes. These marginals are defined by a parametric model followed by a semi-parametric B-spline correction to capture complex distributional features. The marginal parameters are spatially smoothed using Gaussian-process priors with low-rank approximations, rendering the computational cost linear in the spatial dimension. When applied to global log-precipitation-rate fields at more than 50,000 grid locations, our stochastic surrogate achieves high fidelity, accurately quantifying the climate distribution's spatial dependence and marginal characteristics, including the tails. Using only 10 training samples, it outperforms a state-of-the-art competitor trained on 80 samples, effectively octupling the computational budget for climate research. We provide a Python implementation at https://github.com/jobrachem/ppptm .
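The parametric marginal step can be illustrated with a minimal sketch of the map $\mathcal{G}_i(y_i)=\Phi^{-1}(F_i(y_i))$ stated in Theorem 1. This is not the authors' implementation: a logistic CDF is swapped in for the paper's parametric $F_i$ so that the example needs only the Python standard library, and the names `logistic_cdf` and `gaussianize` as well as the values of `loc` and `scale` are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

_STD_NORMAL = NormalDist()  # standard normal, provides Phi^{-1} via inv_cdf


def logistic_cdf(y, loc, scale):
    """Logistic CDF, a hypothetical stand-in for the parametric F_i."""
    return 0.5 * (1.0 + math.tanh(0.5 * (y - loc) / scale))


def gaussianize(y, loc, scale):
    """Compute z = Phi^{-1}(F(y)); z is standard normal if y follows F."""
    u = logistic_cdf(y, loc, scale)
    # Keep u strictly inside (0, 1) so that inv_cdf stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return _STD_NORMAL.inv_cdf(u)


# Draw heavy-tailed (logistic) samples by inverse-CDF sampling.
rng = random.Random(0)
loc, scale = 1.5, 2.0  # illustrative values, not taken from the paper
ys = []
for _ in range(5000):
    p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    ys.append(loc + scale * math.log(p / (1.0 - p)))

# After the transformation the samples should look standard normal.
zs = [gaussianize(y, loc, scale) for y in ys]
print(f"mean={fmean(zs):.3f}, sd={pstdev(zs):.3f}")
```

In the paper, the parameters of the marginal model vary over space and a semi-parametric correction $\mathcal{H}_i$ follows this step; the sketch covers only a single location with known parameters.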

Paper Structure (27 sections, 7 theorems, 20 equations, 8 figures, 1 table)

Introduction
Model
Overview
Copula-inspired separation of marginal and dependence models
Details on the parametric transformations G
Details on the semi-parametric transformations H
Regularization for the semi-parametric transformations
The onion prior for local regularization.
Spatial smoothing.
Maximin ordering of y
Review of the scalable Bayesian transport map T
Parameter estimation and computational complexity
Stage 1: Scalable marginal-model estimation
Low-rank approximation.
Estimation.
...and 12 more sections

Key Result

Theorem 1

Let $(\mathcal{H}_i \circ \mathcal{G}_i)(y_i) \sim \mathcal{N}(0, 1)$, with $\mathcal{H}_i$ as defined above and $\mathcal{G}_i(y_i)=\Phi^{-1}(F_i(y_i))$, where $F_i: \mathbb{R} \rightarrow [0,1]$ is a continuous cumulative distribution function. If $\boldsymbol{\beta}_i = \beta_i \mathbf{1}_D$ for

Figures (8)

Figure 1: Top: Conceptual illustration of the three-part composite transformation mapping a bivariate non-Gaussian distribution to a standard Gaussian distribution. The parametric marginal model $\mathcal{G}$ (here, a location-scale $t$-distribution) standardizes the margins and brings the tails closer to the bulk of the distribution. The semi-parametric model $\mathcal{H}$ corrects residual deviations from marginal Gaussianity. Finally, the Bayesian transport map $\mathcal{T}$ captures and removes the nonlinear dependence, achieving joint Gaussianity. Bottom: Analogous transformation of a global log-precipitation-rate field produced by a climate model (see main text). Because $\mathcal{G}$ and $\mathcal{H}$ are estimated jointly here, the behavior of $\mathcal{G}$ cannot be easily interpreted separately from $\mathcal{H}$.
Figure 2: Illustration of how parameters and knots in our modified monotonically increasing B-spline relate to each other. The bottom panel (orange) shows the knots. The middle panel (blue) shows the spline parameters on the level of $\gamma_1, \dots, \gamma_J$, aligned with corresponding knots. The top panel (green) shows the values of the fixed $\gamma$ parameters, and aligns the freely estimated parameters $\beta_1, \dots, \beta_D$ with their counterparts $\gamma_5, \dots, \gamma_{J-3}$ (see \ref{eq:onion-coefficients}). Note that $k = k_{j+1}-k_j$ is the (constant) distance between two adjacent knots, and that $J =D+7$ and $m = D+5$.
Figure 3: Panel a) illustrates our modified spline using a single random sample of $\boldsymbol{\beta}_i$, which is transformed into $\boldsymbol{\gamma}$ using \ref{eq:onion-coefficients}. The points are the B-spline control points $\gamma_{1,i} + \sum_{\ell=2}^j \exp(\gamma_{\ell,i})$, aligned with the knots $k_0, \dots, k_{m+1}$. Panels b) and c) each show 50 prior predictive samples of $\mathcal{H}_i$ obtained by drawing $\boldsymbol{\beta}_i$ from our onion prior with variance parameters $\tau^2=1$ (b) and $\tau^2=0.01$ (c).
Figure 4: Illustration of the maximin ordering of locations on a global grid, showing the first $M$ locations in red. In the left and middle panels, numbers indicate the position in the ordering. The maximin criterion places each new point to maximize the chordal distance to all preceding points, ensuring uniform global coverage. Apparent asymmetries in the early points arise from projecting this spherical optimization onto a 2D plane for plotting.
Figure 5: Our SCT models are substantially more accurate climate emulators than existing methods in terms of log scores (lower is better). The top row shows average predictive log scores over five random train/test splits. Dashed lines indicate models that include a fitted semi-parametric correction layer $\mathcal{H}$. Results marked with an asterisk (*) are taken from Katzfuss2023-ScalableBayesianTransport, Fig. 10, without re-running the models. The bottom row shows predictive log scores in the SCT-Skew-t specification (including a fitted semi-parametric layer $\mathcal{H}$) as a function of the number of inducing points $M$ and training samples ${N_{\text{train}}}$.
...and 3 more figures
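The maximin ordering described in the caption of Figure 4 can be sketched as a greedy loop: each new location is the one whose minimum chordal distance to the already-ordered locations is largest. The code below is an illustrative quadratic-time version using only the Python standard library, not the authors' implementation; the function names, the choice of the starting location, and the toy coordinates are assumptions.

```python
import math


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def maximin_order(latlon, first=0):
    """Greedy maximin ordering under chordal (straight-line 3D) distance.

    `first` is the index of the starting location; its choice is an
    assumption of this sketch.
    """
    pts = [to_unit_vector(lat, lon) for lat, lon in latlon]
    n = len(pts)
    order = [first]
    # min_dist[j]: distance from location j to its nearest ordered location.
    min_dist = [math.dist(pts[j], pts[first]) for j in range(n)]
    min_dist[first] = -1.0  # negative value marks a location as already used
    for _ in range(n - 1):
        # Pick the unused location farthest from everything ordered so far.
        nxt = max(range(n), key=min_dist.__getitem__)
        order.append(nxt)
        for j in range(n):
            if min_dist[j] >= 0.0:
                min_dist[j] = min(min_dist[j], math.dist(pts[j], pts[nxt]))
        min_dist[nxt] = -1.0
    return order


# Toy example: five locations given as (latitude, longitude) in degrees.
locations = [(0, 0), (0, 10), (0, 180), (0, 90), (90, 0)]
order = maximin_order(locations)
print(order)
```

In this toy example the antipodal location (index 2) is selected right after the starting location, and the location only 10 degrees away from the start (index 1) comes last. For a grid with more than 50,000 locations, a faster algorithm than this quadratic loop would be needed.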

Theorems & Definitions (14)

Theorem 1: Reduction to base model
Theorem 2: Tail probabilities
Lemma 1: $\mathcal{H}_i$ passes through $k_1$
proof
Lemma 2: Unit slope on $[k_1, a]$ and $[b, k_m]$
proof
Lemma 3: Average slope of one over $[a, b]$
proof
Lemma 4: $\mathcal{H}_i$ passes through $a$ and $b$
proof
...and 4 more
