Distance-Based Tree-Sliced Wasserstein Distance

Hoang V. Tran, Khoi N. M. Nguyen, Trang Pham, Thanh T. Chu, Tam Le, Tan M. Nguyen

TL;DR

The paper tackles the computational and invariance limitations of OT in Euclidean spaces by introducing Distance-based Tree-Sliced Wasserstein (Db-TSW), a metric that preserves topological information via tree systems and a novel $\operatorname{E}(d)$-invariant splitting map. It defines a generalized Radon Transform on Systems of Lines that is injective under $\operatorname{E}(d)$-invariance, and proves that Db-TSW is a true Euclidean-invariant metric whose computation is GPU-friendly through a simple Monte Carlo sampling scheme. Theoretical contributions include injectivity results and a formal metric proof, while experiments on diffusion models, gradient flows, and color transfer show improved accuracy with modest computational overhead compared to recent SW variants. Overall, Db-TSW provides a scalable, topology-aware OT tool suitable for high-dimensional data analysis and modern generative modeling.

Abstract

To overcome computational challenges of Optimal Transport (OT), several variants of Sliced Wasserstein (SW) have been developed in the literature. These approaches exploit the closed-form expression of the univariate OT by projecting measures onto (one-dimensional) lines. However, projecting measures onto low-dimensional spaces can lead to a loss of topological information. Tree-Sliced Wasserstein distance on Systems of Lines (TSW-SL) has emerged as a promising alternative that replaces these lines with a more advanced structure called tree systems. The tree structures enhance the ability to capture topological information of the metric while preserving computational efficiency. However, at the core of TSW-SL, the splitting maps, which serve as the mechanism for pushing forward measures onto tree systems, focus solely on the position of the measure supports while disregarding the projecting domains. Moreover, the specific splitting map used in TSW-SL leads to a metric that is not invariant under Euclidean transformations, a typically expected property for OT on Euclidean space. In this work, we propose a novel class of splitting maps that generalizes the existing one studied in TSW-SL, enabling the use of all positional information from input measures and resulting in a novel Distance-based Tree-Sliced Wasserstein (Db-TSW) distance. In addition, we introduce a simple tree sampling process better suited for Db-TSW, leading to an efficient GPU-friendly implementation for tree systems, similar to the original SW. We also provide a comprehensive theoretical analysis of the proposed class of splitting maps to verify the injectivity of the corresponding Radon Transform, and demonstrate that Db-TSW is a Euclidean-invariant metric. Through a wide range of experiments, we empirically show that Db-TSW significantly improves accuracy compared to recent SW variants while maintaining low computational cost.
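
The slicing idea that the abstract builds on can be sketched as follows. This is a minimal NumPy illustration of the original Sliced Wasserstein estimator, not of Db-TSW and not of the authors' implementation. The function name `sliced_wasserstein`, its parameters, and the restriction to equal-size, uniformly weighted point clouds are assumptions made to keep the example short.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of SW_p between two point clouds of equal size.

    x, y: arrays of shape (n, d), each row a support point with weight 1/n.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Sample directions uniformly on the unit sphere (one line per direction).
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both clouds onto every line: shape (n, n_projections).
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # Closed-form univariate OT for uniform empirical measures of equal size:
    # the optimal coupling matches sorted samples.
    return np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p)
```

Every step is a matrix product or a sort over a batch of projections, which is why estimators of this kind map well onto GPUs.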

Paper Structure

This paper contains 53 sections, 4 theorems, 70 equations, 8 figures, 5 tables, 1 algorithm.

Key Result

Proposition 3.1

The $2$-Wasserstein distance and the Sliced $p$-Wasserstein distance are $\operatorname{E}(d)$-invariant. In other words, for every $\mu,\nu \in \mathcal{P}(\mathbb{R}^d)$ and $g \in \operatorname{E}(d)$, we have
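
Writing $g_\sharp\mu$ for the pushforward of $\mu$ by $g$ (notation assumed here), invariance means $\operatorname{W}_2(g_\sharp\mu, g_\sharp\nu) = \operatorname{W}_2(\mu,\nu)$, and likewise for the sliced distance. For the sliced case, the mechanism can be checked numerically. With $g(x) = Qx + a$ and $Q$ orthogonal, projecting $g(x)$ onto a direction $\theta$ equals projecting $x$ onto $Q^\top\theta$ plus a constant shift, and the univariate distance ignores a shift common to both measures. The sketch below uses a hypothetical helper `sw_fixed_directions` and illustrates the argument; it is not code from the paper.

```python
import numpy as np

def sw_fixed_directions(x, y, theta, p=2):
    """SW_p estimate over a fixed set of unit directions theta (rows)."""
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    return np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
n, d = 128, 5
x = rng.normal(size=(n, d))
y = rng.normal(loc=1.0, size=(n, d))

# Fixed unit directions.
theta = rng.normal(size=(64, d))
theta /= np.linalg.norm(theta, axis=1, keepdims=True)

# A Euclidean transformation g(x) = Q x + a with Q orthogonal.
q, _ = np.linalg.qr(rng.normal(size=(d, d)))
a = rng.normal(size=d)
gx, gy = x @ q.T + a, y @ q.T + a

# Projecting g(x) onto theta equals projecting x onto Q^T theta, up to a shift.
sw_original = sw_fixed_directions(x, y, theta @ q)
sw_transformed = sw_fixed_directions(gx, gy, theta)
print(sw_original, sw_transformed)
```

Because the uniform distribution on the sphere is itself rotation-invariant, replacing $\theta$ by $Q^\top\theta$ does not change the expectation over directions, which gives invariance of the sliced distance.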

Figures (8)

  • Figure 1: An illustration of the difference between the old and new Radon Transform on Systems of Lines, focusing on the two definitions of splitting maps. Left: The old splitting map relies solely on the location of points, so it yields the same mass distribution regardless of the position of the line system. Right: The new splitting map takes the configuration of the system of lines into account, so the mass distribution varies from system to system.
  • Figure 2: An illustration of the $\operatorname{E}(d)$-invariance of splitting maps. Starting with a point and a system of lines, two Euclidean transformations are applied, resulting in two additional pairs of points and systems of lines. An $\operatorname{E}(d)$-invariant splitting map handles all three pairs identically, leading to the same mass distribution from the point to the lines within each system.
  • Figure 3: An illustration of constructing a tree system. Starting with a set of unstructured lines (left), we consider the intersections of all pairs of lines (middle), then remove some of the intersections to obtain a tree system (right). There is a unique path between any two points, since paths may pass only through the remaining intersections.
  • Figure 4: Logarithm of the Wasserstein distance over 3 runs on the Gaussian 20d dataset.
  • Figure 5: Comparison of color-transferred images.
  • ...and 3 more figures
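
The invariance pictured in Figure 2 can be illustrated with a splitting map that depends only on point-to-line distances, since a Euclidean transformation applied to both the point and the lines leaves those distances unchanged. The sketch below is a hedged illustration, not the paper's definition: the softmax form, the parameter `delta`, the representation of each line by a source point and a unit direction, and all function names are assumptions introduced here.

```python
import numpy as np

def point_to_line_distances(x, sources, directions):
    """Distance from point x (d,) to each line {s + t u}; sources, directions: (k, d)."""
    diff = x - sources
    # Remove the component of diff along each unit direction.
    along = np.sum(diff * directions, axis=1, keepdims=True) * directions
    return np.linalg.norm(diff - along, axis=1)

def distance_based_split(x, sources, directions, delta=-1.0):
    """Hypothetical splitting map: softmax of delta * distance (assumed form)."""
    scores = delta * point_to_line_distances(x, sources, directions)
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

rng = np.random.default_rng(0)
d, k = 4, 3
x = rng.normal(size=d)
sources = rng.normal(size=(k, d))
directions = rng.normal(size=(k, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Apply g(z) = Q z + a to the point and to every line (directions only rotate).
q, _ = np.linalg.qr(rng.normal(size=(d, d)))
a = rng.normal(size=d)
weights = distance_based_split(x, sources, directions)
weights_g = distance_based_split(q @ x + a, sources @ q.T + a, directions @ q.T)
print(weights, weights_g)
```

With a negative `delta`, lines closer to the point receive more mass; the two weight vectors agree up to floating-point error because only distances enter the map.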

Theorems & Definitions (22)

  • Remark 1
  • Proposition 3.1
  • Proposition 3.2
  • Remark 2
  • Definition 4.1: Radon Transform on Systems of Lines
  • Remark 3
  • Definition 4.2
  • Theorem 4.3
  • Remark 4
  • Definition 5.1: Distance-Based Tree-Sliced Wasserstein Distance
  • ...and 12 more