Distribution-on-Distribution Regression with Wasserstein Metric: Multivariate Gaussian Case

Ryo Okano; Masaaki Imaizumi

TL;DR

This study introduces models for regression from one Gaussian distribution to another using the Wasserstein metric, and establishes convergence rates of the in-sample prediction errors for empirical risk minimization in these models.

Abstract

Distribution data refers to a data set in which each sample is represented as a probability distribution, and it is receiving growing interest in statistics. Although several studies have developed distribution-to-distribution regression models for univariate variables, the multivariate scenario remains under-explored due to technical complexities. In this study, we introduce models for regression from one Gaussian distribution to another, utilizing the Wasserstein metric. These models are constructed using the geometry of the Wasserstein space, which enables the transformation of Gaussian distributions into elements of a linear matrix space. Because they take the form of linear regression, our models are easy to interpret, and their implementation is simplified because the optimal transport problem between Gaussian distributions has an analytical solution. We also explore a generalization of our models to non-Gaussian scenarios. We establish the convergence rates of in-sample prediction errors for empirical risk minimization in our models. In comparative simulation experiments, our models outperform a simpler alternative method that transforms Gaussian distributions into matrices. We illustrate our methodology with an application to weather data.
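The analytical solution mentioned in the abstract can be illustrated with the standard closed form for the 2-Wasserstein distance between two Gaussian distributions, $W_2^2 = \|m_1 - m_2\|^2 + \mathrm{tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_1^{1/2}\Sigma_2\Sigma_1^{1/2})^{1/2}\big)$, together with the linear optimal transport map. The following is a minimal sketch, not the authors' implementation; the function names `sqrtm_psd`, `gaussian_w2`, and `gaussian_ot_map` are hypothetical and chosen here for illustration only.

```python
import numpy as np


def sqrtm_psd(a):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T


def gaussian_w2(m1, s1, m2, s2):
    """2-Wasserstein distance between N(m1, s1) and N(m2, s2)."""
    s1_half = sqrtm_psd(s1)
    cross = sqrtm_psd(s1_half @ s2 @ s1_half)
    sq = np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross)
    return float(np.sqrt(max(sq, 0.0)))


def gaussian_ot_map(s1, s2):
    """Matrix T of the optimal map x -> m2 + T (x - m1).

    Assumes s1 is positive definite so that its square root is invertible.
    """
    s1_half = sqrtm_psd(s1)
    s1_half_inv = np.linalg.inv(s1_half)
    return s1_half_inv @ sqrtm_psd(s1_half @ s2 @ s1_half) @ s1_half_inv


# Illustrative inputs (made up for this sketch, not data from the paper).
m1, s1 = np.zeros(2), np.diag([1.0, 4.0])
m2, s2 = np.array([3.0, 4.0]), np.diag([9.0, 1.0])
dist = gaussian_w2(m1, s1, m2, s2)
T = gaussian_ot_map(s1, s2)
```

For these diagonal covariances the squared distance reduces to the squared mean difference plus the squared differences of the standard deviations, and the map satisfies $T \Sigma_1 T = \Sigma_2$, which gives a quick sanity check on the sketch.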

Paper Structure (31 sections, 6 theorems, 90 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Related Studies
Notation
Background
Optimal Transport
The Wasserstein Space and its Tangent Space
Specification with Gaussian Case
Model
Idea: Nearly isometry between Gaussian Space and Linear Matrix Space
Regression Model
Basic model
Low-Rank Model
Comparison with Existing Models in Terms of Generalization to Multivariate Case
Generalization to Elliptically Symmetric Distributions
Empirical Risk Minimization Algorithms
...and 16 more sections

Key Result

Proposition 1

Let $\mu_\ast \in \mathcal{G}(\mathbb{R}^d)$ be an arbitrary fixed reference measure. For any $\mu \in \mathcal{G}(\mathbb{R}^d)$, we have [...]. Moreover, if $\mu_\ast \in \mathscr{C}_U$ holds, we have the following for any $\mu_1, \mu_2 \in \mathscr{C}_U$: [...]

Figures (7)

Figure 1: Illustration of the structure of the proposed regression model between the Gaussian spaces $\mathcal{G}(\mathbb{R}^{d_1})$ and $\mathcal{G}(\mathbb{R}^{d_2})$. The Gaussian distributions $\nu_1$ and $\nu_2$ are transformed into the random elements $X$ and $Y$ in the linear matrix spaces $\Xi_{d_1}$ and $\Xi_{d_2}$ by the nearly isometric maps $\varphi_{\nu_{1\oplus}}$ and $\varphi_{\nu_{2\oplus}}$, respectively. A linear regression model with regression map $\Gamma_{\mathbb{B}_0}$ is then assumed between $X$ and $Y$.
Figure 2: Boxplots of the out-of-sample AWDs defined in \ref{eq:awd} for the four scenarios with $n \in \{50, 200\}$ and $N \in \{50, 500\}$. "proposed" denotes the proposed method and "alternative" denotes the alternative method. The number in brackets "[ ]" below each boxplot for the proposed method indicates the number of runs in which event \ref{eq:awd} occurred and boundary projection was needed.
Figure 3: Boxplots of the out-of-sample AWDs defined in \ref{eq:awd} for the low-rank methods with rank $K \in \{2, 3, 4\}$. "proposed" denotes the proposed method and "alternative" denotes the alternative method. The number in brackets "[ ]" below each boxplot for the proposed method indicates the number of runs in which event \ref{eq:awd} occurred and boundary projection was needed.
Figure 4: Boxplots of the out-of-sample AWDs defined in \ref{eq:awd} for the three scenarios with degrees of freedom $\ell \in \{5, 10, 15\}$. "proposed" denotes the proposed method and "alternative" denotes the alternative method. The number in brackets "[ ]" below each boxplot for the proposed method indicates the number of runs in which event \ref{eq:fall} occurred and boundary projection was needed.
Figure 5: Observed data and estimated Gaussian joint densities of the average temperatures and average relative humidity in spring (top row) and summer (bottom row) from 1953 to 1956. Black points are observed data and solid lines are contour lines of estimated densities.
...and 2 more figures

Theorems & Definitions (13)

Proposition 1
Remark 1: Scalar response model
Theorem 1: Basic Model
Theorem 2: Rank-$K$ Model
proof : Proof of Proposition 1
Theorem 3: park2023towards, Section 4.1
proof : Proof of Theorem 1
proof : Proof of Theorem 2
Theorem 4: Consistency of Estimator
proof
...and 3 more
