Extending Characterizations of Multivariate Laws via Distance Distributions

Annika Betken, Aljosa Marjanovic, Katharina Proksch

TL;DR

This work broadens the identifiability framework for multivariate distributions by replacing the rigid homogeneity assumption with analytic conditions: volume-regularity of distance-induced balls, Lebesgue differentiability with respect to the distance, and bounded centered oscillations of densities. It proves a generalized identifiability theorem: if i.i.d. samples $X_i \sim f$ and $Y_j \sim g$ have densities with the stated regularity, then equality of the within-sample and between-sample interpoint distance distributions forces $f$ and $g$ to coincide. Corollaries recover the original theorem of Maa, Pearl, and Bartoszynski via monotone transformations and extend the result to compact Riemannian manifolds. The paper also provides quantitative bounds linking the $L^2$-distance between the densities to Kolmogorov discrepancies of the distance distributions, yielding dimension-aware rates under Ahlfors $\alpha$-regularity of the distance and $\beta$-Hölder continuity of the densities. Concrete examples, including the Canberra, entropic, and Bray–Curtis distances, demonstrate both the applicability and the boundaries of the theory, offering a principled foundation for distance-based two-sample testing beyond translation-invariant geometries.

Abstract

We extend a theorem of Maa, Pearl, and Bartoszynski, which links equality of interpoint distance distributions to equality of underlying multivariate distributions, beyond the restrictive class of homogeneous, translation-invariant distance functions. Our approach replaces geometric assumptions on the distance with analytic conditions: volume-regularity of distance-induced balls, Lebesgue differentiability with respect to the distance, and bounded centered oscillations of densities. Under these conditions, equality of interpoint distance distributions continues to imply equality of the generating laws. The result persists under monotone continuous transformations of homogeneous, translation-invariant distances, recovering the original statement, and it extends to compact Riemannian manifolds equipped with the geodesic metric. We further develop a quantitative version of the theorem, i.e., inequalities that connect discrepancies of interpoint distance distributions to the $L^2$-distance between densities, and obtain explicit rates under Ahlfors $\alpha$-regularity of the distance function and $\beta$-Hölder continuity of densities, capturing dependence on dimensionality. Several representative examples illustrate the applicability of the generalization to domain-specific distances used in modern statistics. The examples include non-homogeneous, non-translation-invariant distances such as Canberra, entropic distances, and the Bray–Curtis dissimilarity.
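The comparison of within-sample and between-sample interpoint distance distributions described above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes the standard coordinate-wise Canberra distance with the convention that a 0/0 coordinate contributes 0, uses the Kolmogorov (sup-norm) distance between empirical distribution functions as the discrepancy, and draws illustrative Gamma samples. All function names and sample choices are hypothetical.

```python
import numpy as np

def canberra(a, b):
    """Pairwise Canberra distances between rows of a (n, k) and b (m, k)."""
    num = np.abs(a[:, None, :] - b[None, :, :])
    den = np.abs(a[:, None, :]) + np.abs(b[None, :, :])
    # Assumed convention: a coordinate with 0/0 contributes 0 to the sum.
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return ratio.sum(axis=-1)

def within(sample):
    """Distances d(X_i, X_j) for all pairs i < j of one sample."""
    d = canberra(sample, sample)
    return d[np.triu_indices(len(sample), k=1)]

def between(x, y):
    """Distances d(X_i, Y_j) for all cross-sample pairs."""
    return canberra(x, y).ravel()

def kolmogorov(u, v):
    """Sup-norm distance between the empirical CDFs of u and v."""
    grid = np.sort(np.concatenate([u, v]))
    fu = np.searchsorted(np.sort(u), grid, side="right") / len(u)
    fv = np.searchsorted(np.sort(v), grid, side="right") / len(v)
    return float(np.max(np.abs(fu - fv)))

def discrepancies(x, y):
    """Kolmogorov distances of each within-sample law to the between-sample law."""
    dxx, dyy, dxy = within(x), within(y), between(x, y)
    return kolmogorov(dxx, dxy), kolmogorov(dyy, dxy)

# Illustrative data (an assumption): positive vectors in R^3 from Gamma laws.
rng = np.random.default_rng(0)
x = rng.gamma(2.0, 1.0, size=(200, 3))
y_same = rng.gamma(2.0, 1.0, size=(200, 3))   # same law as x
y_diff = rng.gamma(6.0, 1.0, size=(200, 3))   # different shape parameter

print("same law:     ", discrepancies(x, y_same))
print("different law:", discrepancies(x, y_diff))
```

When both samples share a law, the two discrepancies should be small up to sampling noise; when the laws differ, at least one of them is expected to be clearly larger. This only illustrates the direction of the identifiability statement and is not a calibrated test.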

Paper Structure

This paper contains 4 sections, 7 theorems, 100 equations, 2 figures.

Key Result

Theorem 1

Let $X_1, X_2, X_3$ be i.i.d. random vectors with values in $\mathbb{R}^k$ and Lebesgue probability density $f_X$, let $Y_1, Y_2, Y_3$ be i.i.d. random vectors with values in $\mathbb{R}^k$ and Lebesgue probability density $f_Y$, and let $X_1, X_2, X_3$ and $Y_1, Y_2, Y_3$ be independent. Let $d: \dots$ Moreover, assume that $\dots$ Then, it holds that $\dots$

Figures (2)

  • Figure 1: Comparison of the volumes of one-dimensional balls in Canberra and Euclidean distances. The plots show the volumes $\Phi(x,t)$ as color-coded level sets, where $x \in (0,10)$ and $t \in (0, 0.1)$. Left: Canberra balls; right: Euclidean balls. For a fixed radius $t$, the volume of a Euclidean ball does not depend on its position in space, while in Canberra distance balls become larger the farther their center is from the origin.
  • Figure 2: Behavior of two-dimensional balls in Canberra distance. The figures are generated by computing Canberra distances on a dense grid of points and coloring those inside a ball of a given radius. Upper row: unit balls, with centers $(10, 1)$, $(10, 10)$, $(5, 5)$ and $(50, 10)$. Lower row: balls of radius $0.8$, $0.6$, $0.4$ and $0.2$ centered at $(10, 10)$.
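The position dependence of one-dimensional Canberra balls described in Figure 1 can be checked with a short computation. The sketch below assumes the one-dimensional Canberra distance $d(x,y) = |x-y|/(|x|+|y|)$ (with $d(0,0)=0$) and estimates the ball volume by counting grid points. The closed form $4xt/(1-t^2)$ for $x > 0$ and $0 < t < 1$ is derived here by solving $d(x,y) < t$ for $y$, which gives the interval $\big(x\tfrac{1-t}{1+t},\, x\tfrac{1+t}{1-t}\big)$; it is not quoted from the paper. Function names and grid settings are hypothetical.

```python
import numpy as np

def canberra_1d(x, y):
    """One-dimensional Canberra distance |x - y| / (|x| + |y|), with 0/0 := 0."""
    den = np.abs(x) + np.abs(y)
    return np.divide(np.abs(x - y), den,
                     out=np.zeros_like(den, dtype=float), where=den > 0)

def euclid_1d(x, y):
    """One-dimensional Euclidean distance |x - y|."""
    return np.abs(x - y)

def volume_grid(dist, x, t, lo=-50.0, hi=50.0, n=1_000_001):
    """Estimate the Lebesgue measure of {y : dist(x, y) < t} on a uniform grid."""
    ys = np.linspace(lo, hi, n)
    inside = dist(x, ys) < t
    return float(inside.sum() * (hi - lo) / (n - 1))

def canberra_volume_closed_form(x, t):
    """Derived closed form 4 x t / (1 - t^2), valid for x > 0 and 0 < t < 1."""
    return 4.0 * x * t / (1.0 - t**2)

t = 0.1
for x in (1.0, 5.0, 10.0):
    print(x,
          volume_grid(canberra_1d, x, t),
          canberra_volume_closed_form(x, t),
          volume_grid(euclid_1d, x, t))
```

For fixed $t$, the Euclidean estimate stays near $2t$ for every center, while the Canberra volume grows linearly in $x$, which matches the qualitative behavior shown in the figure.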

Theorems & Definitions (20)

  • Theorem 1: cf. Theorem 2 in Maa96
  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 2
  • Definition 5
  • Theorem 3
  • Remark 1
  • Remark 2
  • ...and 10 more