Hilbert Curve Projection Distance for Distribution Comparison
Tao Li, Cheng Meng, Hongteng Xu, Jun Yu
TL;DR
This work introduces the Hilbert curve projection (HCP) distance, a scalable surrogate for the Wasserstein distance for comparing probability distributions with bounded supports. The method projects both distributions onto a Hilbert space-filling curve to obtain a coupling, then computes the transport cost of that coupling in the original space. The authors prove that $\mathrm{HCP}_p$ is a metric and that it upper-bounds the Wasserstein distance, $\mathrm{W}_p(\mu,\nu)\leq \mathrm{HCP}_p(\mu,\nu)$. To address the curse of dimensionality, they propose two variants based on subspace projections, integral projection robust HCP (IPRHCP) and projection robust HCP (PRHCP), along with convergence guarantees and efficient algorithms. Experiments on synthetic and real data show that HCP closely tracks the Wasserstein distance at lower computational cost, and that PRHCP remains robust in high dimensions, which makes the approach attractive for data classification and generative modeling.
Abstract
Distribution comparison plays a central role in many machine learning tasks such as data classification and generative modeling. In this study, we propose a novel metric, called the Hilbert curve projection (HCP) distance, to measure the distance between two probability distributions with low complexity. In particular, we first project two high-dimensional probability distributions using a Hilbert curve to obtain a coupling between them, and then calculate the transport distance between these two distributions in the original space according to the coupling. We show that the HCP distance is a proper metric and is well-defined for probability measures with bounded supports. Furthermore, we demonstrate that the modified empirical HCP distance with the $L_p$ cost in the $d$-dimensional space converges to its population counterpart at a rate of no more than $O(n^{-1/(2\max\{d,p\})})$. To suppress the curse of dimensionality, we also develop two variants of the HCP distance using (learnable) subspace projections. Experiments on both synthetic and real-world data show that our HCP distance works as an effective surrogate of the Wasserstein distance with low complexity and overcomes the drawbacks of the sliced Wasserstein distance.
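The two-step procedure described above (sort by position along a Hilbert curve to obtain a coupling, then measure transport cost in the original space) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes two-dimensional data, two samples of equal size with uniform weights, a Euclidean ground cost, and a grid quantization controlled by a hypothetical `order` parameter. The function names `hilbert_index_2d` and `hcp_distance_2d` are illustrative only.

```python
import math


def hilbert_index_2d(x, y, order):
    """Position of integer cell (x, y) along a 2D Hilbert curve on a 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hcp_distance_2d(xs, ys, p=2, order=10):
    """Sketch of an empirical HCP-style distance between two equal-size 2D samples.

    Assumptions (not taken from the paper): uniform weights, Euclidean ground
    cost, and quantization of the joint bounding box to a 2^order grid.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    n = 1 << order
    pts = list(xs) + list(ys)
    lo = [min(pt[k] for pt in pts) for k in range(2)]
    hi = [max(pt[k] for pt in pts) for k in range(2)]
    span = [(hi[k] - lo[k]) or 1.0 for k in range(2)]

    def curve_key(pt):
        # Map the point to a grid cell, then to its index along the curve.
        cx = min(n - 1, int((pt[0] - lo[0]) / span[0] * (n - 1)))
        cy = min(n - 1, int((pt[1] - lo[1]) / span[1] * (n - 1)))
        return hilbert_index_2d(cx, cy, order)

    # Sorting both samples along the curve pairs the k-th point of one
    # sample with the k-th point of the other: this is the coupling.
    xs_sorted = sorted(xs, key=curve_key)
    ys_sorted = sorted(ys, key=curve_key)

    # Transport cost of that coupling, measured in the original space.
    total = sum(
        math.hypot(a[0] - b[0], a[1] - b[1]) ** p
        for a, b in zip(xs_sorted, ys_sorted)
    )
    return (total / len(xs)) ** (1.0 / p)
```

Because the coupling comes from two sorts, the sketch runs in $O(n \log n)$ time for $n$ points per sample. Any coupling built this way is feasible for the optimal transport problem, which is why its cost cannot fall below the Wasserstein distance under the same ground cost.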
