SRRM: Improving Recursive Transport Surrogates in the Small-Discrepancy Regime

Yufei Zhang, Tao Wang, Jingyi Zhang

Abstract

Recursive partitioning methods provide computationally efficient surrogates for the Wasserstein distance, yet their statistical behavior and their resolution in the small-discrepancy regime remain insufficiently understood. We study Recursive Rank Matching (RRM) as a representative instance of this class under a population-anchored reference. In this setting, we establish consistency and an explicit convergence rate for the anchored empirical RRM under the quadratic cost. We then identify a dominant mismatch mechanism responsible for the loss of resolution in the small-discrepancy regime. Based on this analysis, we introduce Selective Recursive Rank Matching (SRRM), which suppresses the resulting dominant mismatches and yields a higher-fidelity practical surrogate for the Wasserstein distance at moderate additional computational cost.
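As a rough illustration of the kind of recursive partitioning surrogate the abstract refers to, the following minimal sketch pairs two equal-size samples by repeatedly splitting both at their medians along cycling coordinate axes and then averages the squared pairing cost. It is not the paper's RRM or SRRM: the function names `recursive_rank_pairs` and `surrogate_sq_cost`, the axis-cycling rule, and the equal-sample-size restriction are all assumptions made for this example.

```python
import numpy as np


def recursive_rank_pairs(X, Y):
    """Pair rows of X with rows of Y by recursive median splits (illustrative only)."""
    n, d = X.shape
    assert Y.shape == (n, d), "this sketch assumes equal-size samples"
    pairs = []

    def split(x_idx, y_idx, depth):
        if len(x_idx) == 0:
            return
        if len(x_idx) == 1:
            # A cell holding one point from each sample: match them.
            pairs.append((int(x_idx[0]), int(y_idx[0])))
            return
        axis = depth % d  # assumption: cycle through the coordinate axes
        # Rank both samples along the current axis.
        xs = x_idx[np.argsort(X[x_idx, axis], kind="stable")]
        ys = y_idx[np.argsort(Y[y_idx, axis], kind="stable")]
        half = len(xs) // 2
        # Lower halves are matched to each other, and likewise upper halves.
        split(xs[:half], ys[:half], depth + 1)
        split(xs[half:], ys[half:], depth + 1)

    idx = np.arange(n)
    split(idx, idx, 0)
    return pairs


def surrogate_sq_cost(X, Y):
    """Mean squared Euclidean cost of the recursive pairing."""
    i, j = np.array(recursive_rank_pairs(X, Y)).T
    return float(np.mean(np.sum((X[i] - Y[j]) ** 2, axis=1)))
```

Because the pairing is a valid one-to-one coupling of the two samples, the square root of `surrogate_sq_cost` can only overestimate the empirical 2-Wasserstein distance, which is why such constructions serve as surrogates rather than exact solvers.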

Paper Structure

This paper contains 29 sections, 7 theorems, 103 equations, 20 figures, 2 tables, 2 algorithms.

Key Result

Lemma 3.3

Let $T_\mu:[0,1]\to\mathbb{R}^d$ be the induced transport map of Definition 3.2. Then $T_\mu$ is Borel measurable and satisfies $(T_\mu)_\# \mathrm{Unif}[0,1] = \mu$.
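
For intuition only, the classical one-dimensional counterpart of this pushforward identity is inverse-transform sampling. Whether the paper's $T_\mu$ reduces to the quantile function when $d=1$ is an assumption here, not something the source states. With $F_\mu$ the distribution function of a probability measure $\mu$ on $\mathbb{R}$ and $U\sim\mathrm{Unif}[0,1]$:

$$
F_\mu^{-1}(u) = \inf\{x\in\mathbb{R} : F_\mu(x)\ge u\},
\qquad
\mathbb{P}\bigl(F_\mu^{-1}(U)\le x\bigr) = \mathbb{P}\bigl(U\le F_\mu(x)\bigr) = F_\mu(x),
$$

so that $(F_\mu^{-1})_\# \mathrm{Unif}[0,1] = \mu$. The middle equality uses the equivalence $F_\mu^{-1}(u)\le x \iff u\le F_\mu(x)$ for $u\in(0,1)$.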

Figures (20)

  • Figure 1.1: The last-mile phenomenon of recursive partitioning methods.
  • Figure 2.1: (a) Samples of the source and target distributions. (b) Illustrations of how various distance metrics change as $\alpha$ increases. (c) Comparisons of different distance metrics in terms of runtime.
  • Figure 2.2: Hilbert space-filling curve visualization.
  • Figure 2.3: Samples of the distributions and illustrations of how various distance metrics change.
  • Figure 2.4: Samples of the last-mile problem.
  • ...and 15 more figures

Theorems & Definitions (19)

  • Remark 3.1
  • Definition 3.2: Mass-median axis-recursive RRM tree curve and induced transport map
  • Lemma 3.3: Pushforward property of the induced map
  • Definition 3.4
  • Theorem 3.5
  • Lemma 3.7: Global Hölder control
  • Theorem 3.8: Consistency and rate for the anchored empirical RRM
  • Corollary 3.9: Two-sample stability of $\mathrm{RRM}$
  • Theorem 3.10: Finite-depth consistency of the empirical mass-median tree
  • Definition 4.1: Premature splitting under depth-$H$ address restriction
  • ...and 9 more