Stereographic Spherical Sliced Wasserstein Distances

Huy Tran, Yikun Bai, Abihith Kothapalli, Ashkan Shahbazi, Xinran Liu, Rocio Diaz Martin, Soheil Kolouri

TL;DR

This work introduces Stereographic Spherical Sliced Wasserstein (S3W), a fast, scalable distance for comparing probability measures on the sphere. It maps the sphere to Euclidean space via stereographic projection and applies a generalized Radon transform to obtain 1D marginals. It analyzes the distance distortion this projection causes and proposes a rotationally invariant extension (RI-S3W) and an amortized variant (ARI-S3W) to improve robustness and efficiency. The authors provide theoretical foundations for slices based on the Stereographic Spherical Radon transform (SSRT), practical numerical implementations, and experiments across gradient flows, self-supervised learning, sliced-Wasserstein autoencoders (SWAE), and Earth-density estimation, demonstrating substantial speedups and competitive accuracy relative to recent baselines. The approach opens pathways for efficient, geometry-aware probabilistic modeling on spherical domains and can be extended to unbalanced or partial optimal transport (OT) and to other manifolds. Overall, S3W and its variants offer a practical, theoretically grounded toolkit for fast spherical optimal transport in machine learning and scientific computing.

Abstract

Comparing spherical probability distributions is of great interest in various fields, including geology, medical domains, computer vision, and deep representation learning. The utility of optimal transport-based distances, such as the Wasserstein distance, for comparing probability measures has spurred active research in developing computationally efficient variations of these distances for spherical probability measures. This paper introduces a high-speed and highly parallelizable distance for comparing spherical measures using the stereographic projection and the generalized Radon transform, which we refer to as the Stereographic Spherical Sliced Wasserstein (S3W) distance. We carefully address the distance distortion caused by the stereographic projection and provide an extensive theoretical analysis of our proposed metric and its rotationally invariant variation. Finally, we evaluate the performance of the proposed metrics and compare them with recent baselines in terms of both speed and accuracy through a wide range of numerical studies, including gradient flows and self-supervised learning. Our code is available at https://github.com/mint-vu/s3wd.
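The pipeline described in the abstract (project the sphere to Euclidean space, slice into 1D marginals, compare the marginals with a 1D Wasserstein distance) can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes the common equatorial-plane convention $\phi(s) = s_{1:d}/(1 - s_{d+1})$ for the stereographic projection, whose scaling may differ from the paper's, and it uses plain linear slices (the case $h = \mathrm{id}$) with no distortion correction. All function names are hypothetical, and only the Python standard library is used.

```python
import math
import random


def stereographic_projection(s):
    """Map a point s on S^d (as a list in R^{d+1}) to R^d.

    Assumed convention: project from the north pole (0, ..., 0, 1)
    onto the equatorial plane, phi(s) = s[:d] / (1 - s[d]).
    """
    *head, last = s
    denom = 1.0 - last
    if denom <= 1e-12:
        raise ValueError("the north pole has no image under the projection")
    return [x / denom for x in head]


def random_direction(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def wasserstein_1d(u, v, p=2):
    """p-th power of W_p between two equal-size empirical measures on R.

    In 1D the optimal coupling matches sorted samples.
    """
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) ** p for a, b in zip(sorted(u), sorted(v))) / len(u)


def sliced_distance_after_projection(X, Y, n_slices=64, p=2, seed=0):
    """Monte Carlo sliced Wasserstein distance between projected samples.

    X and Y are lists of points on S^d; linear slices only (h = id).
    """
    rng = random.Random(seed)
    PX = [stereographic_projection(s) for s in X]
    PY = [stereographic_projection(s) for s in Y]
    d = len(PX[0])
    total = 0.0
    for _ in range(n_slices):
        theta = random_direction(d, rng)
        # 1D marginals: inner products <phi(x), theta>
        u = [sum(a * b for a, b in zip(x, theta)) for x in PX]
        v = [sum(a * b for a, b in zip(y, theta)) for y in PY]
        total += wasserstein_1d(u, v, p)
    return (total / n_slices) ** (1.0 / p)
```

Each slice costs one sort per sample set, and slices are independent of one another, which is consistent with the abstract's description of the distance as highly parallelizable. The sketch omits the injective map $h$ and the rotation averaging that the paper uses to handle distortion.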

Paper Structure

This paper contains 58 sections, 21 theorems, 209 equations, 40 figures, 8 tables, 5 algorithms.

Key Result

Proposition 1

For $\mu\in \mathcal{M}(\mathbb{S}^{d})$ that does not give mass to the North Pole $\{s_n\}$, the Stereographic Spherical Radon transforms $\mathcal{S}_\mathcal{G}$ and $\mathcal{S}_\mathcal{H}$ satisfy the following properties:

Figures (40)

  • Figure 1: Depiction of stereographic projection from $\mathbb{S}^2\backslash \{s_n\}$ to $\mathbb{R}^2$ (a), the stereographic Radon transform integration surfaces on the sphere, i.e., the level sets of $\langle \phi(x),\theta\rangle$ for a fixed $\theta\in \mathbb{R}^d$ (b), and the generalized stereographic Radon transform integration surfaces on the sphere, i.e., the level sets of $\langle h\circ\phi(x),\theta\rangle$ for a fixed $\theta\in \mathbb{R}^{d'}$ (c).
  • Figure 2: Spherical distance (i.e., the arclength) versus the distance after stereographic projection, where CC denotes Pearson's correlation coefficient. From left to right: when the injective function $h=\mathrm{id}$ and the distance is $\|\phi(s)-\phi(s')\|$ (a); when $h(x)=h_1(x)$ (see Eq. \ref{eq: h_1}) and the distance is $\|h(\phi(s))-h(\phi(s'))\|$ (b); when $h(x)=h_1(x)$ and the distance is $\min(\|h(\phi(s))-h(\phi(s'))\|,2\pi-\|h(\phi(s))-h(\phi(s'))\|)$ (c); and finally when $h(x)=h_{NN}(x)$ (see Eq. \ref{eq:h_NN}), where $\rho(x)$ is a trained neural network minimizing Eq. \ref{eq: loss_rho} and $C\geq 2\pi$ (d).
  • Figure 3: Runtime comparison for Wasserstein distance, Sinkhorn distance (Cuturi, 2013) with geodesic distance as cost function, $SW_2$ (sliced Wasserstein) distance, $SSW_1$ distance (using level median formula) (Bonet et al., 2022), $SSW_2$ distance with binary search (BS) and antipodal closed form (for uniform distribution) (Bonet et al., 2022), $S3W_2$ distance (ours), $RI\text{-}S3W_2$ distance (ours), and $ARI\text{-}S3W_2$ distance (ours).
  • Figure 4: Learning a mixture of $12$ vMFs. $ARI\text{-}S3W$ (30) uses $30$ rotations and a pool size of $1000$. $S3W$ variants use $\text{LR}=0.01$. $SSW$ has an additional $\text{LR}=0.05$ for better comparison. The plots show convergence of different distances w.r.t. iterations and runtime. The table summarizes numerical results for $10$ independent runs. We provide more details of the plots in the appendix, Section \ref{subsec:evo_loss_curve}.
  • Figure 5: Projected features on $\mathbb{S}^2$ for CIFAR-10.
  • ...and 35 more figures
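
The comparison in panel (a) of Figure 2 (arclength on the sphere versus $\|\phi(s)-\phi(s')\|$ with $h=\mathrm{id}$, summarized by Pearson's correlation coefficient) can be reproduced in spirit with a short sketch. It is an illustration only: it assumes the equatorial-plane convention $\phi(s) = s_{1:d}/(1 - s_{d+1})$, uniform sampling on $\mathbb{S}^2$, and hypothetical function names, so the resulting coefficient need not match the value reported in the figure.

```python
import math
import random


def stereographic_projection(s):
    """Assumed convention: phi(s) = s[:d] / (1 - s[d]), north pole excluded."""
    *head, last = s
    denom = 1.0 - last
    if denom <= 1e-12:
        raise ValueError("the north pole has no image under the projection")
    return [x / denom for x in head]


def arclength(s, t):
    """Geodesic distance on the unit sphere: arccos of the inner product."""
    dot = sum(a * b for a, b in zip(s, t))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def sample_sphere(dim, rng):
    """Uniform sample on the unit sphere in R^dim via a normalized Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def distortion_correlation(n_pairs=2000, seed=0):
    """Correlation between arclength and projected distance over random pairs."""
    rng = random.Random(seed)
    geodesic, projected = [], []
    for _ in range(n_pairs):
        s, t = sample_sphere(3, rng), sample_sphere(3, rng)
        geodesic.append(arclength(s, t))
        projected.append(
            euclidean(stereographic_projection(s), stereographic_projection(t))
        )
    return pearson(geodesic, projected)
```

Points close to the north pole are sent far from the origin, so a few pairs can dominate the projected distances. That is the kind of distortion the injective maps $h_1$ and $h_{NN}$ in panels (b) through (d) are meant to reduce.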

Theorems & Definitions (42)

  • Proposition 1
  • Theorem 2
  • Proposition 3
  • Theorem 4
  • Theorem 5: Riesz-Markov Representation theorem
  • Lemma 6: Identity of Radon measures
  • proof
  • Proposition 7
  • proof
  • Proposition 8
  • ...and 32 more