Optimal Transport for Measures with Noisy Tree Metric

Tam Le, Truyen Nguyen, Kenji Fukumizu

TL;DR

This work proposes novel uncertainty sets of tree metrics, defined through edge deletion/addition, that cover a diversity of tree structures in a single framework, and demonstrates that the resulting robust OT satisfies the metric property and is negative definite.

Abstract

We study the optimal transport (OT) problem for probability measures supported on a tree metric space. Such an OT problem (i.e., tree-Wasserstein (TW)) is known to admit a closed-form expression, but it depends fundamentally on the underlying tree structure over the supports of the input measures. In practice, however, the given tree structure may be perturbed by noisy or adversarial measurements. To mitigate this issue, we follow the max-min robust OT approach, which considers the maximal possible distance between two input measures over an uncertainty set of tree metrics. In general, this approach is hard to compute, even for measures supported in one-dimensional space, because of its non-convexity and non-smoothness, which hinders its practical application, especially in large-scale settings. In this work, we propose novel uncertainty sets of tree metrics from the lens of edge deletion/addition, which cover a diversity of tree structures in an elegant framework. Building on the proposed uncertainty sets and leveraging the tree structure over the supports, we show that the robust OT also admits a closed-form expression for fast computation, like its standard OT counterpart (i.e., TW). Furthermore, we demonstrate that the robust OT satisfies the metric property and is negative definite. We then exploit its negative definiteness to propose positive definite kernels and test them in several simulations on various real-world datasets for document classification and topological data analysis.
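The closed-form expression for standard TW referred to in the abstract is commonly written as $\sum_{e} w_e \, |\mu(\Gamma(v_e)) - \nu(\Gamma(v_e))|$, where $\Gamma(v_e)$ is the subtree hanging below edge $e$. The following is a minimal Python sketch of that formula only, not of the robust variant. The dictionary-based tree representation and the names `parent`, `weight`, and `tree_wasserstein` are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


def tree_wasserstein(parent, weight, mu, nu):
    """Closed-form TW distance on a rooted tree (illustrative sketch).

    parent[v] : parent of vertex v; the root maps to None
    weight[v] : length of the edge between v and parent[v]
    mu, nu    : dicts vertex -> mass, each summing to 1
    """
    def depth(v):
        d = 0
        while parent[v] is not None:
            v = parent[v]
            d += 1
        return d

    # Visit the deepest vertices first so children are handled before parents.
    order = sorted(parent, key=depth, reverse=True)

    # diff[v] accumulates mu(subtree of v) - nu(subtree of v).
    diff = defaultdict(float)
    for v in parent:
        diff[v] += mu.get(v, 0.0) - nu.get(v, 0.0)

    total = 0.0
    for v in order:
        p = parent[v]
        if p is None:
            continue  # the root has no edge above it
        total += weight[v] * abs(diff[v])
        diff[p] += diff[v]  # push the subtree imbalance up to the parent
    return total


# Hypothetical toy tree: r has children a and b; a has child c.
example_parent = {"r": None, "a": "r", "b": "r", "c": "a"}
example_weight = {"a": 1.0, "b": 2.0, "c": 3.0}

# A unit mass at c versus a unit mass at b: the path c-a-r-b has length 6.
print(tree_wasserstein(example_parent, example_weight, {"c": 1.0}, {"b": 1.0}))  # 6.0
```

The computation is a single bottom-up pass over the edges, which is why the distance depends so directly on the tree structure and edge lengths.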

Paper Structure (65 sections, 6 theorems, 50 equations, 33 figures, 1 table)


Key Result

Theorem 3.1

Given a tree ${\mathcal{T}}$, denote by $V$ the set of vertices of ${\mathcal{T}}$. Let ${\mathcal{T}}'$ be a tree constructed from ${\mathcal{T}}$ by collapsing a $0$-length edge, i.e., merging the two vertices of an edge $e$ in ${\mathcal{T}}$ with $w_e = 0$. Consequently, for any measu… To simplify the notation, we also write ${\mathcal{W}}_{{\mathcal{T}}'}(\mu, \nu)$ for …
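The theorem statement is truncated in this extract, so its exact conclusion is not reproduced here. What can be checked independently is a property of the standard TW closed form: an edge with $w_e = 0$ contributes nothing to the sum, so merging its two endpoints (and moving the child's mass onto the parent) leaves the standard TW value unchanged. The sketch below illustrates this numerically. All function names, the dictionary representation, and the toy tree are assumptions made for illustration.

```python
def tw(parent, weight, mu, nu):
    """Standard TW closed form: sum over edges of w_e * |mass imbalance below e|."""
    def depth(v):
        d = 0
        while parent[v] is not None:
            v, d = parent[v], d + 1
        return d

    diff = {v: mu.get(v, 0.0) - nu.get(v, 0.0) for v in parent}
    total = 0.0
    for v in sorted(parent, key=depth, reverse=True):  # children before parents
        p = parent[v]
        if p is not None:
            total += weight[v] * abs(diff[v])
            diff[p] += diff[v]
    return total


def collapse_edge(parent, weight, v):
    """Merge vertex v into its parent; intended for the case weight[v] == 0."""
    p = parent[v]
    new_parent = {u: (p if q == v else q) for u, q in parent.items() if u != v}
    new_weight = {u: w for u, w in weight.items() if u != v}
    return new_parent, new_weight


def merge_mass(measure, v, p):
    """Move the mass sitting on v onto p, mirroring the vertex merge."""
    merged = {u: m for u, m in measure.items() if u != v}
    merged[p] = merged.get(p, 0.0) + measure.get(v, 0.0)
    return merged


# Hypothetical tree: the edge between r and v1 has length zero.
parent = {"r": None, "v1": "r", "v2": "r", "v3": "v1", "v4": "v1"}
weight = {"v1": 0.0, "v2": 1.0, "v3": 2.0, "v4": 0.5}
mu = {"v3": 0.6, "v1": 0.4}
nu = {"v2": 0.5, "v4": 0.5}

before = tw(parent, weight, mu, nu)
parent2, weight2 = collapse_edge(parent, weight, "v1")
after = tw(parent2, weight2, merge_mass(mu, "v1", "r"), merge_mass(nu, "v1", "r"))
print(abs(before - after) < 1e-12)  # True
```

After the collapse, `v3` and `v4` hang directly from `r`, so the tree's branching pattern changes even though the distance does not. This is the sense in which zero-length edges let trees with different structures be compared.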

Figures (33)

  • Figure 1: An illustration of transforming one tree structure into another through edge addition/deletion. Given the binary tree structure ${\mathcal{T}}_1$, if we collapse edge $e_1$ by merging vertex $v_1$ into the root vertex $r$ in ${\mathcal{T}}_1$, we obtain the ternary tree structure ${\mathcal{T}}_2$. Additionally, in tree ${\mathcal{T}}_1$, if we duplicate vertex $v_1$ into $\{v_a, v_1 \}$, connect these two nodes by edge $e_a$, and add the vertex $v_b$ with edge $e_b$ between $v_b$ and root $r$, then we obtain the ternary tree structure ${\mathcal{T}}_3$.
  • Figure 2: SVM results and time consumption for kernel matrices in document classification. For each dataset, the numbers in parentheses are, respectively, the number of classes, the number of documents, and the maximum number of unique words per document.
  • Figure 3: SVM results and time consumption for kernel matrices in TDA. For each dataset, the numbers in parentheses are, respectively, the number of persistence diagrams (PDs) and the maximum number of points in a PD.
  • Figure 4: SVM results for document classification w.r.t. the radius $\lambda$.
  • Figure 5: SVM results for TDA w.r.t. the radius $\lambda$.
  • ...and 28 more figures

Theorems & Definitions (14)

  • Theorem 3.1
  • Example 3.2: Edge deletion for tree metric
  • Example 3.3: Edge addition for tree metric
  • Proposition 3.4
  • Proposition 3.5: Connection between two approaches
  • Theorem 3.6: Negative definiteness
  • Proposition 3.7: Infinitely divisible kernels
  • Proposition 3.8: Metric
  • proof
  • proof
  • ...and 4 more
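
The negative definiteness result is what makes kernel construction possible: by a classical result (Schoenberg), if $d$ is negative definite then $\exp(-t\,d)$ is positive definite for every $t > 0$. The sketch below illustrates this with a stand-in distance, the $\ell_1$ distance between feature vectors, which is the form standard TW takes when each measure is mapped to its edge-weighted subtree masses. The feature vectors, function names, and the choice of $t$ are hypothetical. The paper's robust distance would replace `l1` here, and its exact kernel definition is not shown in this extract.

```python
import math


def l1(x, y):
    """l1 distance, a standard example of a negative definite function."""
    return sum(abs(a - b) for a, b in zip(x, y))


def gram(features, t=1.0):
    """Gram matrix of the kernel k(x, y) = exp(-t * d(x, y))."""
    return [[math.exp(-t * l1(x, y)) for y in features] for x in features]


def is_positive_definite(K, tol=1e-12):
    """Try a Cholesky factorisation of symmetric K; fail if a pivot is <= tol."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False  # non-positive pivot: not positive definite
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


# Hypothetical feature vectors (one per measure), all distinct.
features = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.5],
]

K = gram(features, t=1.0)
print(is_positive_definite(K))  # True
```

A Gram matrix that passes this check can be handed to a kernel SVM as a precomputed kernel, which is the setting described in the figure captions above.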