Sliced Wasserstein Estimation with Control Variates

Khai Nguyen; Nhat Ho

TL;DR

This work tackles the high variance in Monte Carlo estimates of the sliced Wasserstein distance by introducing control variates. It builds Gaussian approximations of the projected measures and leverages the closed-form $W_2^2$ between Gaussians to construct a lower-bound and an upper-bound control variate, preserving linear-time computation. The resulting CV-SW estimators demonstrably reduce variance in image and point-cloud comparisons, accelerate point-cloud gradient flows, and improve deep generative modeling metrics on CIFAR10 and CelebA. The approach is broadly applicable to SW variants and offers a practical variance-reduction tool for high-dimensional distribution comparisons.

Abstract

The sliced Wasserstein (SW) distance between two probability measures is defined as the expectation of the Wasserstein distance between one-dimensional projections of the two measures. The randomness comes from the projecting direction used to map the two input measures to one dimension. Because this expectation is intractable, the SW distance is estimated by Monte Carlo integration. Although the SW distance has many variants, no prior work improves its Monte Carlo estimation scheme by controlling the variance. To bridge the literature on variance reduction and the literature on the SW distance, we propose computationally efficient control variates that reduce the variance of the empirical estimate of the SW distance. The key idea is to first fit Gaussian approximations to the projected one-dimensional measures and then use the closed form of the Wasserstein-2 distance between two Gaussian distributions to design the control variates. In particular, we propose using a lower bound and an upper bound of the Wasserstein-2 distance between the two fitted Gaussians as two computationally efficient control variates. We show empirically that the proposed control variate estimators reduce the variance considerably when comparing measures over images and point-clouds. Finally, we demonstrate the favorable performance of the proposed control variate estimators in gradient flows that interpolate between two point-clouds and in deep generative modeling on standard image datasets such as CIFAR10 and CelebA.
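The estimation scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes uniform weights, equally sized samples, and a lower-bound control variate taken to be the squared difference of the projected means, whose expectation over uniform directions on the sphere is $\|\bar{x}-\bar{y}\|^2/d$. The function names and the plug-in estimate of the coefficient `gamma` are assumptions made for illustration.

```python
import numpy as np

def sample_directions(L, d, rng):
    # Uniform directions on the unit sphere via normalized Gaussians.
    theta = rng.standard_normal((L, d))
    return theta / np.linalg.norm(theta, axis=1, keepdims=True)

def projected_w2_squared(X, Y, theta):
    # One-dimensional W_2^2 per direction for two uniform empirical
    # measures with the same number of points: sort and match.
    Xp = np.sort(X @ theta.T, axis=0)
    Yp = np.sort(Y @ theta.T, axis=0)
    return np.mean((Xp - Yp) ** 2, axis=0)

def sw2_estimates(X, Y, L=100, seed=0):
    # Returns (plain Monte Carlo estimate, control variate estimate) of SW_2^2.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    theta = sample_directions(L, d, rng)
    w = projected_w2_squared(X, Y, theta)

    # Assumed control variate: squared difference of projected means.
    diff = X.mean(axis=0) - Y.mean(axis=0)
    c = (theta @ diff) ** 2
    c_expect = diff @ diff / d  # closed form, since E[theta theta^T] = I / d

    # Plug-in coefficient: sample covariance over sample variance.
    var_c = np.mean((c - c.mean()) ** 2)
    cov_wc = np.mean((w - w.mean()) * (c - c.mean()))
    gamma = cov_wc / var_c if var_c > 0 else 0.0

    plain = w.mean()
    controlled = plain - gamma * (c.mean() - c_expect)
    return plain, controlled
```

When one sample is a pure translation of the other, the projected distance equals the control variate in every direction, so the corrected estimate collapses to the closed-form expectation regardless of which directions were drawn.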

Paper Structure (24 sections, 5 theorems, 34 equations, 7 figures, 4 tables, 3 algorithms)

Introduction
Background
Control Variate Sliced Wasserstein Estimators
Control Variate for Sliced Wasserstein Distance
Constructing Control Variates
Experiments
Comparing empirical probability measures over images and point-clouds
Point Cloud Gradient Flows
Deep Generative Modeling
Conclusion
Discussion
Non-parametric hypothesis testing
Proofs
Proof of Proposition \ref{prop:Gaussianapproximation}
Proof of Proposition \ref{prop:calcuating}
...and 9 more sections

Key Result

Proposition 1

Let $\mu$ and $\nu$ be two discrete probability measures, i.e., $\mu= \sum_{i=1}^n \alpha_i \delta_{x_i}$ ($\sum_{i=1}^n\alpha_i=1$) and $\nu= \sum_{i=1}^m \beta_i\delta_{y_i}$ ($\sum_{i=1}^m\beta_i=1$). Then we have: $m_1(\theta;\mu) = \sum_{i=1}^n \alpha_i \theta^\top x_i$, $\sigma_1^2(\theta;\mu) =
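As a rough illustration of the quantities in this proposition, the sketch below fits a Gaussian to a projected discrete measure and evaluates the closed-form $W_2^2$ between two one-dimensional Gaussians. The mean follows the formula stated above. The variance is an assumption: the statement here is cut off before giving it, so the code uses the ordinary weighted variance of the projected points. The function names are hypothetical.

```python
import numpy as np

def projected_gaussian_fit(points, weights, theta):
    # Mean as in the proposition: sum_i alpha_i * theta^T x_i.
    proj = points @ theta
    mean = np.sum(weights * proj)
    # Assumed variance: weighted variance of the projected points.
    var = np.sum(weights * (proj - mean) ** 2)
    return mean, var

def gaussian_w2_squared(m1, v1, m2, v2):
    # Closed-form W_2^2 between N(m1, v1) and N(m2, v2) in one dimension.
    return (m1 - m2) ** 2 + (np.sqrt(v1) - np.sqrt(v2)) ** 2
```

The squared mean difference alone is a lower bound on this closed form, since the second term is non-negative.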

Figures (7)

Figure 1: The empirical errors of the conventional estimator (SW) and the control variate estimators (LCV-SW, UCV-SW) when comparing empirical distributions over MNIST images and point-clouds.
Figure 2: Point-cloud gradient flows for $L=10$ from SW, LCV-SW, and UCV-SW respectively.
Figure 3: Randomly generated images for the compared distances on CIFAR10 and CelebA.
Figure 4: Testing results on Gaussian distributions across different choices of dimension D. Left: power for Gaussian distributions, where the shifted covariance matrix is still diagonal; Middle: power for Gaussian distributions, where the shifted covariance matrix is non-diagonal; Right: Type-I error.
Figure 5: The empirical errors of the conventional estimator (SW) and the control variate estimators (LCV-SW, UCV-SW) when comparing empirical distributions over MNIST images and point-clouds.
...and 2 more figures

Theorems & Definitions (12)

Definition 1
Definition 2
Remark 1
Proposition 1
Proposition 2
Definition 3
Proposition 3
Proposition 4
Definition 4
Definition 5
...and 2 more
