HiMAP: Hilbert Mass-Aligned Parameterization for Multivariate Barycenters and Fréchet Regression

Tao Wang, Qiannan Huang, Jun Zhu, Cheng Meng

TL;DR

HiMAP is a Hilbert mass-aligned parameterization that endows multivariate measures with a distribution-invariant notion of quantile level. It delivers barycenters and regression fits comparable to standard optimal-transport surrogates, with substantial speedups in schemes dominated by repeated barycenter evaluations.

Abstract

Many learning tasks represent responses as multivariate probability measures, requiring repeated computation of weighted barycenters in Wasserstein space. In multivariate settings, transport barycenters are often computationally demanding and, more importantly, are generally not well posed under the affine weight schemes inherent to global and local Fréchet regression, where weights sum to one but may be negative. We propose HiMAP, a Hilbert mass-aligned parameterization that endows multivariate measures with a distribution-invariant notion of quantile level. The construction recursively refines the domain through equiprobable conditional-median splits and follows a Hilbert curve ordering, so that a single scalar index consistently tracks cumulative probability mass across distributions. This yields an embedding into a Hilbert function space and induces a tractable discrepancy for distribution comparison and averaging. Crucially, the representation is closed under affine averaging, leading to a closed-form, well-posed barycenter and an explicit distribution-valued Fréchet regression estimator obtained by averaging HiMAP quantile maps. We establish consistency and a dimension-dependent polynomial convergence rate for HiMAP estimators under mild conditions, matching the classical rates for empirical convergence in multivariate Wasserstein geometry. Numerical experiments and a multivariate climate-indicator study demonstrate that HiMAP delivers barycenters and regression fits comparable to standard optimal-transport surrogates while achieving substantial speedups in schemes dominated by repeated barycenter evaluations.
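
The well-posedness claim under affine weights can be illustrated with a standard Hilbert-space identity. The notation below is assumed for illustration and is not taken from the paper: $Q_i$ denotes the HiMAP quantile map of the $i$-th measure, viewed as an element of a Hilbert space with inner product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$, and $w_1,\dots,w_n$ are weights with $\sum_i w_i = 1$, some of which may be negative.

```latex
\begin{align*}
\sum_{i=1}^{n} w_i \,\|Q - Q_i\|^2
  &= \|Q\|^2 - 2\Big\langle Q, \sum_{i=1}^{n} w_i Q_i \Big\rangle
     + \sum_{i=1}^{n} w_i \,\|Q_i\|^2 \\
  &= \Big\| Q - \sum_{i=1}^{n} w_i Q_i \Big\|^2
     + \sum_{i=1}^{n} w_i \,\|Q_i\|^2
     - \Big\| \sum_{i=1}^{n} w_i Q_i \Big\|^2 .
\end{align*}
```

Only the first term on the last line depends on $Q$, so the objective has the unique minimizer $Q = \sum_i w_i Q_i$ whatever the signs of the individual weights. This sketch covers only the minimization in the ambient Hilbert space; that the affine combination is again a valid HiMAP quantile map is the closure property the abstract asserts.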

Paper Structure (38 sections, 11 theorems, 198 equations, 14 figures, 4 tables, 2 algorithms)


Key Result

Lemma 4

Under Assumption 1, the diameter of the path cells shrinks exponentially. Specifically, letting $\rho=1-\frac{m}{2M} \in (0,1)$, there exists $C>0$ such that for all $L$ and $t$,

Figures (14)

  • Figure 1: Illustration of the first three orders of the Hilbert curve construction.
  • Figure 2: Comparison of construction policies. (a) Geometry-driven approach: partitions space based on a fixed spatial metric, sensitive to support boundaries. (b) Mass-aligned approach (Ours): recursively splits cells by conditional medians, adapting to the underlying density. The legend in (b) denotes splits along the logical axes; note that in the lower-left and lower-right quadrants, the Round 3 logical $x$-split maps to a horizontal geometric split due to Hilbert coordinate reflections.
  • Figure 3: Barycenter estimation procedure. (a) For each sample $U_{i,n_i}$, we estimate the $\texttt{HiMAP}$ quantile function via the tree-construction and evaluation algorithms. (b) Computing the barycenter is then equivalent to taking a weighted sum of these quantile functions, i.e., performing a weighted aggregation of the discrete support points at each shared quantile level. The resulting red point cloud represents the estimated $\texttt{HiMAP}$ barycenter distribution.
  • Figure 4: Barycentric interpolation between two clustered measures. Columns correspond to weights $[1,0]$, $[3/4,1/4]$, $[1/2,1/2]$, $[1/4,3/4]$, and $[0,1]$. Rows show the $\texttt{HiMAP}$ Barycenter (top), Sliced WB (middle), and Sinkhorn WB (bottom).
  • Figure 5: Concentric ellipses benchmark with $30$ input measures (left block). We compare the $\texttt{HiMAP}$ Barycenter, Sliced WB, and Sinkhorn WB. Reported are the barycenter point clouds (top row within the table), the runtime (seconds), and the average unregularized OT cost to the inputs under uniform weights.
  • ...and 9 more figures
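
The two-step procedure described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration under strong simplifying assumptions, not the paper's algorithms: the function names `mass_aligned_order`, `quantile_map`, and `affine_barycenter` are hypothetical, all samples are assumed to have the same size, and the ordering uses plain median splits along alternating axes (a kd-tree style order) in place of the Hilbert curve ordering with reflections.

```python
import numpy as np


def mass_aligned_order(points):
    """Order point indices by recursive median splits along alternating axes.

    Assumption: the lower half always precedes the upper half, so this is a
    kd-tree style ordering without the Hilbert reflections HiMAP uses.
    """
    points = np.asarray(points, dtype=float)
    n_axes = points.shape[1]

    def recurse(idx, level):
        if len(idx) <= 1:
            return [idx]
        axis = level % n_axes
        # Stable sort of this cell's points along the current axis.
        ranked = idx[np.argsort(points[idx, axis], kind="stable")]
        half = len(ranked) // 2  # empirical conditional-median split
        return recurse(ranked[:half], level + 1) + recurse(ranked[half:], level + 1)

    return np.concatenate(recurse(np.arange(len(points)), 0))


def quantile_map(points):
    """Support points in mass-aligned order.

    Row k stands in for quantile level (k + 0.5) / n (an illustrative choice).
    """
    points = np.asarray(points, dtype=float)
    return points[mass_aligned_order(points)]


def affine_barycenter(clouds, weights):
    """Weighted sum of quantile maps; weights sum to one and may be negative."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to one")
    maps = np.stack([quantile_map(c) for c in clouds])  # (num_clouds, n, d)
    return np.tensordot(weights, maps, axes=1)  # (n, d)


# Usage: a cloud and a translated copy of it (synthetic data).
rng = np.random.default_rng(0)
cloud_a = rng.integers(0, 1000, size=(64, 2)).astype(float)
cloud_b = cloud_a + np.array([2.0, -1.0])
midpoint = affine_barycenter([cloud_a, cloud_b], [0.5, 0.5])
extrapolated = affine_barycenter([cloud_a, cloud_b], [1.5, -0.5])  # negative weight
```

Because the split points move with the data, a translated copy receives the same ordering as the original, so the midpoint barycenter here is the original cloud shifted by half the translation. The negative-weight case is computed by the same formula, which is the situation that arises with Fréchet regression weights.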

Theorems & Definitions (17)

  • Remark 1
  • Remark 2: Local Logical Coordinates and Reflections
  • Lemma 4: Geometric Decay of Cell Diameters
  • Lemma 5: Existence of Limiting Map
  • Definition 6
  • Theorem 7: Pushforward Identification
  • Proposition 8
  • Theorem 9: Pointwise Convergence
  • Corollary 10: $L^2$ Convergence
  • Theorem 11: Existence, Uniqueness, and Closed Form
  • ...and 7 more