Sliced Wasserstein Estimation with Control Variates
Khai Nguyen, Nhat Ho
TL;DR
This work tackles the high variance in Monte Carlo estimates of the sliced Wasserstein distance by introducing control variates. It builds Gaussian approximations of the projected measures and leverages the closed-form $W_2^2$ between Gaussians to construct a lower-bound and an upper-bound control variate, preserving linear-time computation. The resulting CV-SW estimators demonstrably reduce variance in image and point-cloud comparisons, accelerate point-cloud gradient flows, and improve deep generative modeling metrics on CIFAR10 and CelebA. The approach is broadly applicable to SW variants and offers a practical variance-reduction tool for high-dimensional distribution comparisons.
Abstract
The sliced Wasserstein (SW) distance between two probability measures is defined as the expectation of the Wasserstein distance between one-dimensional projections of the two measures. The randomness comes from the projecting direction used to project the two input measures onto one dimension. Because this expectation is intractable, Monte Carlo integration is used to estimate the SW distance. Although many variants of the SW distance exist, no prior work has improved the Monte Carlo estimation scheme by controlling its variance. To bridge the literature on variance reduction and the literature on the SW distance, we propose computationally efficient control variates that reduce the variance of the empirical estimate of the SW distance. The key idea is to first find Gaussian approximations of the projected one-dimensional measures and then use the closed-form expression for the Wasserstein-2 distance between two Gaussian distributions to design the control variates. In particular, we propose using a lower bound and an upper bound of the Wasserstein-2 distance between the two fitted Gaussians as two computationally efficient control variates. We empirically show that the proposed control variate estimators reduce the variance considerably when comparing measures over images and point clouds. Finally, we demonstrate the favorable performance of the proposed control variate estimators in gradient flows that interpolate between two point clouds and in deep generative modeling on standard image datasets such as CIFAR10 and CelebA.
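
The estimation scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes two empirical measures with the same number of equally weighted support points, directions drawn uniformly from the unit sphere, and one natural pair of bounds on the Gaussian $W_2^2$: $(m_1 - m_2)^2$ as the lower bound and $(m_1 - m_2)^2 + \sigma_1^2 + \sigma_2^2$ as the upper bound, where $m_i$ and $\sigma_i^2$ are the mean and variance of the projected measures. The function names, the `bound` argument, and the sample-based estimate of the coefficient `gamma` are hypothetical choices made for illustration.

```python
import numpy as np


def sample_directions(num_projections, dim, rng):
    """Draw directions uniformly from the unit sphere in R^dim."""
    theta = rng.standard_normal((num_projections, dim))
    return theta / np.linalg.norm(theta, axis=1, keepdims=True)


def projected_w2_squared(x, y, theta):
    """Squared W_2 between the 1-D projections of x and y, one value per direction.

    Assumes x and y have the same number of points with uniform weights,
    so the 1-D optimal coupling matches sorted projections.
    """
    x_proj = np.sort(x @ theta.T, axis=0)  # shape (n, L)
    y_proj = np.sort(y @ theta.T, axis=0)
    return np.mean((x_proj - y_proj) ** 2, axis=0)  # shape (L,)


def cv_sw_squared(x, y, num_projections=100, bound="lower", seed=0):
    """Return (control-variate estimate, plain Monte Carlo estimate) of SW_2^2."""
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    theta = sample_directions(num_projections, dim, rng)
    w = projected_w2_squared(x, y, theta)

    # Assumed lower bound of the Gaussian W_2^2: squared gap of projected means.
    # Its expectation is closed-form because E[theta theta^T] = I / dim.
    mean_diff = x.mean(axis=0) - y.mean(axis=0)
    control = (theta @ mean_diff) ** 2
    control_mean = mean_diff @ mean_diff / dim

    if bound == "upper":
        # Assumed upper bound: add the two projected variances.
        xc = x - x.mean(axis=0)
        yc = y - y.mean(axis=0)
        control = (
            control
            + np.mean((xc @ theta.T) ** 2, axis=0)
            + np.mean((yc @ theta.T) ** 2, axis=0)
        )
        control_mean = control_mean + (
            np.mean(np.sum(xc ** 2, axis=1)) + np.mean(np.sum(yc ** 2, axis=1))
        ) / dim

    # Coefficient gamma = Cov(W, C) / Var(C), estimated from the same samples.
    c_centered = control - control.mean()
    var_c = np.mean(c_centered ** 2)
    gamma = np.mean((w - w.mean()) * c_centered) / var_c if var_c > 0 else 0.0

    plain = w.mean()
    corrected = plain - gamma * (control.mean() - control_mean)
    return corrected, plain
```

For a pure translation, `y = x + shift`, the projected distance coincides with the lower-bound control variate, so the corrected estimate in this sketch equals `shift @ shift / dim` up to floating-point error regardless of the number of projections. Both control variates cost time linear in the number of support points per direction.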
