Finite sample bounds for barycenter estimation in geodesic spaces

Victor-Emmanuel Brunel, Jordan Serres

TL;DR

The paper develops non-asymptotic, finite-sample guarantees for estimating barycenters of distributions in geodesic spaces with an upper curvature bound. By extending concentration tools via Laplace transforms to CAT$(\kappa)$ spaces, it derives dimension-free bounds for empirical and iterated barycenters, including both expectation and high-probability results that generalize Hoeffding- and Bernstein-type inequalities. It then presents two algorithmic applications: a fast stochastic approximation in CAT$(0)$ spaces and a parallelized barycenter estimation scheme in symmetric spaces, each with PAC guarantees. Finally, the Riemannian case is treated to yield refined, geometry-aware bounds under sub-Gaussian conditions on tangent-space projections, highlighting the practical impact for non-Euclidean data analysis and statistical inference in curved spaces.

Abstract

We study the problem of estimating the barycenter of a distribution given i.i.d. data in a geodesic space. Assuming an upper curvature bound in Alexandrov's sense and a support condition ensuring the strong geodesic convexity of the barycenter problem, we establish finite-sample error bounds in expectation and with high probability. Our results generalize Hoeffding- and Bernstein-type concentration inequalities from Euclidean to geodesic spaces. Building on these concentration inequalities, we derive statistical guarantees for two efficient algorithms for the computation of barycenters.

Paper Structure

This paper contains 23 sections, 37 theorems, 73 equations, 1 figure.

Key Result

Proposition 1

Let $(M,\mathop{}\!\mathrm{d})$ be a $\textrm{CAT}(\kappa)$ space for some $\kappa\in\mathbb R$. Then, for all $x,y\in M$ with $\mathop{}\!\mathrm{d}(x,y)<D_\kappa$, there is a unique geodesic from $x$ to $y$.
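The threshold $D_\kappa$ is not defined in this excerpt. Under the usual convention for $\textrm{CAT}(\kappa)$ spaces, which the paper presumably follows, it is the diameter of the model space of constant curvature $\kappa$:

$$
D_\kappa =
\begin{cases}
\pi/\sqrt{\kappa} & \text{if } \kappa > 0,\\
+\infty & \text{if } \kappa \le 0.
\end{cases}
$$

With this reading, the distance restriction is vacuous for $\kappa\le 0$, so any two points of a $\textrm{CAT}(0)$ space are joined by a unique geodesic, while on the unit sphere ($\kappa=1$) uniqueness holds for all pairs of points at distance less than $\pi$, that is, all non-antipodal pairs.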

Figures (1)

  • Figure 1: Barycenter on a metric tree ($n=3p$): Here, the iterated barycenter $\tilde{b}_n$ of $x_1,\ldots,x_n$ does not get anywhere close to the barycenter $b_n$, no matter how large $n$ is, if $x_1,\ldots,x_n$ are taken in this order.
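
The order dependence described in this caption can be reproduced numerically. The sketch below is only an illustration under assumptions that are not taken from the figure itself: the tree is a tripod (three unit-length legs glued at a center), the sample consists of $p$ copies of each leg endpoint listed leg by leg, and the iterated barycenter follows the standard inductive rule (start at $x_1$, then move a fraction $1/k$ of the way along the geodesic towards $x_k$). The names `dist`, `geodesic`, `iterated_barycenter`, and `ordered_sample` are hypothetical helpers.

```python
from typing import List, Tuple

# A point on the tripod is (leg index, distance from the center).
# Assumption: three legs of length 1; this layout is illustrative only.
Point = Tuple[int, float]

CENTER: Point = (0, 0.0)  # the center belongs to every leg; leg 0 is arbitrary


def dist(a: Point, b: Point) -> float:
    """Tree distance: along the leg if shared, through the center otherwise."""
    if a[0] == b[0]:
        return abs(a[1] - b[1])
    return a[1] + b[1]


def geodesic(a: Point, b: Point, t: float) -> Point:
    """Point at fraction t of the way along the unique geodesic from a to b."""
    (i, r), (j, s) = a, b
    if i == j:
        return (i, (1 - t) * r + t * s)
    travelled = t * (r + s)
    if travelled <= r:            # still on the starting leg
        return (i, r - travelled)
    return (j, travelled - r)     # passed through the center onto b's leg


def iterated_barycenter(points: List[Point]) -> Point:
    """Inductive mean: b_1 = x_1, b_k = geodesic(b_{k-1}, x_k, 1/k)."""
    b = points[0]
    for k, x in enumerate(points[1:], start=2):
        b = geodesic(b, x, 1.0 / k)
    return b


def ordered_sample(p: int) -> List[Point]:
    """p copies of each leg endpoint, listed one leg after another (n = 3p)."""
    return [(leg, 1.0) for leg in range(3) for _ in range(p)]


if __name__ == "__main__":
    for p in (1, 10, 1000):
        b = iterated_barycenter(ordered_sample(p))
        print(p, b, dist(b, CENTER))
```

In this configuration the barycenter $b_n$ is the center of the tripod by symmetry, whereas the iterated barycenter ends at distance about $1/3$ from the center on the last leg for every $p$: it reaches the center after the first $2p$ points and then averages $2p$ units of weight at the center with $p$ points at the last endpoint. Shuffling the sample before iterating removes this systematic bias, which is consistent with the guarantees being stated for i.i.d. data.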

Theorems & Definitions (64)

  • Definition 1
  • Definition 2
  • Definition 3
  • Proposition 1
  • Lemma 1
  • Definition 4
  • Lemma 2: Metric projection onto a convex domain
  • Lemma 3
  • proof
  • Proposition 2: Variance inequality
  • ...and 54 more