Fast proxy centers for Jeffreys centroids: The Jeffreys-Fisher-Rao and the inductive Gauss-Bregman centers

Frank Nielsen

TL;DR

This paper proposes the new Jeffreys–Fisher–Rao center defined as the Fisher–Rao midpoint of the sided Kullback–Leibler centroids as a plug-in replacement of the Jeffreys centroid and defines a new type of inductive center generalizing the principle of the Gauss arithmetic–geometric double sequence mean for pairs of densities of any given exponential family.

Abstract

The symmetric Kullback-Leibler centroid, also called the Jeffreys centroid, of a set of mutually absolutely continuous probability distributions on a measure space provides a notion of centrality which has proven useful in many tasks, including information retrieval, information fusion, and clustering in image, video, and sound processing. However, the Jeffreys centroid is not available in closed form for sets of categorical or normal distributions, two widely used statistical models, and thus needs to be approximated numerically in practice. In this paper, we first propose the new Jeffreys-Fisher-Rao center, defined as the Fisher-Rao midpoint of the sided Kullback-Leibler centroids, as a plug-in replacement of the Jeffreys centroid. This Jeffreys-Fisher-Rao center admits a generic formula for uni-parameter exponential family distributions and closed-form formulas for categorical and normal distributions, matches exactly the Jeffreys centroid for same-mean normal distributions, and is experimentally observed to be close to the Jeffreys centroid. Second, we define a new type of inductive center generalizing the principle of the Gauss arithmetic-geometric double sequence mean for pairs of densities of any given exponential family. This center is shown experimentally to approximate the Jeffreys centroid very well and is suggested for use when the Jeffreys-Fisher-Rao center is not available in closed form. Moreover, this Gauss-Bregman inductive center always converges and matches the Jeffreys centroid for sets of same-mean normal distributions. We report on our experiments demonstrating the use of the Jeffreys-Fisher-Rao and Gauss-Bregman centers instead of the Jeffreys centroid. Finally, we conclude this work by reinterpreting these fast proxy centers of Jeffreys centroids under the lens of dually flat spaces in information geometry.
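
To make the first construction concrete, here is a minimal Python sketch of the Jeffreys-Fisher-Rao center for categorical distributions. It is an illustration, not the paper's reference implementation. It assumes the standard facts that the two sided Kullback-Leibler centroids of categorical distributions are the weighted arithmetic mean and the normalized weighted geometric mean, and that the Fisher-Rao geodesic between two categorical distributions is a great-circle arc under the square-root embedding onto the unit sphere. The function names (`arithmetic_mean`, `normalized_geometric_mean`, `fisher_rao_midpoint`, `fisher_rao_distance`, `jfr_center`) are hypothetical, and all probabilities are assumed strictly positive.

```python
import math


def arithmetic_mean(dists, weights):
    """Weighted arithmetic mean of the probability vectors."""
    d = len(dists[0])
    return [sum(w * p[j] for w, p in zip(weights, dists)) for j in range(d)]


def normalized_geometric_mean(dists, weights):
    """Weighted geometric mean, renormalized to sum to one.

    Assumes every probability is strictly positive (log is taken).
    """
    d = len(dists[0])
    g = [math.exp(sum(w * math.log(p[j]) for w, p in zip(weights, dists)))
         for j in range(d)]
    z = sum(g)
    return [x / z for x in g]


def fisher_rao_distance(p, q):
    """Fisher-Rao distance between two categorical distributions."""
    cos_half = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 2.0 * math.acos(min(1.0, cos_half))  # clamp rounding overshoot


def fisher_rao_midpoint(p, q):
    """Geodesic midpoint: average the square roots, renormalize, square."""
    s = [math.sqrt(a) + math.sqrt(b) for a, b in zip(p, q)]
    z = sum(x * x for x in s)
    return [x * x / z for x in s]


def jfr_center(dists, weights=None):
    """Fisher-Rao midpoint of the two sided Kullback-Leibler centroids."""
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights by default
    a = arithmetic_mean(dists, weights)
    g = normalized_geometric_mean(dists, weights)
    return fisher_rao_midpoint(a, g)


if __name__ == "__main__":
    # Two arbitrary trinomial parameter vectors, chosen for illustration only.
    print(jfr_center([[0.5, 0.3, 0.2], [0.99, 0.005, 0.005]]))
```

Everything here is a direct formula with no iteration, which is what makes the center usable as a cheap plug-in when the exact Jeffreys centroid would require a numerical solver.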

Paper Structure

This paper contains 20 sections, 8 theorems, 69 equations, 10 figures, 2 tables, 3 algorithms.

Key Result

Theorem 1

The Jeffreys centroid of a set of $n$ categorical distributions parameterized by $\mathcal{P}=\{p_1,\ldots,p_n\}\in\Delta_d$ arranged in a matrix $P=[p_{i,j}]\in\mathbb{R}^{n\times d}$ and weighted by a vector $w=(w_1,\ldots,w_n)\in\Delta^n$ is $c(\lambda)=(c_1(\lambda),\ldots,c_d(\lambda))$ with where $a_j=\sum_{i=1}^n w_ip_{i,j}$ and $g_j=\frac{\prod_{i=1}^n p_{i,j}^{w_i}}{\sum_{j=1}^d \prod_{i

Figures (10)

  • Figure 1: Application of centroids and centers in signal processing. Left: information fusion and mixture model simplification, where a Gaussian mixture model is simplified to a single normal distribution. Right: distributed estimation, where a dataset is split among $p$ processes $P_i$ which first estimate the statistical model parameters $\hat{\theta}_i$. The process models are then aggregated to yield a single consolidated model $\hat{\theta}$.
  • Figure 2: Visualizing the arithmetic, normalized geometric, numerical Jeffreys, Jeffreys-Fisher-Rao, and Gauss-Bregman centroids/centers in red, blue, green, purple and yellow, respectively. Left: Input set consists of $n=32$ trinomial distributions with parameters chosen randomly. Right: Input set consists of two trinomial distributions with parameters $(\frac{1}{2},\frac{1}{2})$ and $(0.99,0.005,0.005)$. The numerical Jeffreys centroid (green) is time-consuming to calculate using the Lambert $W$ function. However, the Jeffreys centroid can be well approximated by either the Jeffreys-Fisher-Rao center (purple) or the inductive Gauss-Bregman center (yellow). Point centers are visualized with different radii in order to distinguish them easily.
  • Figure 3: Left: Displaying the arithmetic, normalized geometric, numerical Jeffreys, Jeffreys-Fisher-Rao, and Gauss-Bregman centroids/centers in red, blue, green, purple and yellow, respectively. The input set consists of two normalized histograms with $d=256$ bins plotted as polylines with $255$ line segments. Observe that the Jeffreys-Fisher-Rao center (purple) and Gauss-Bregman center (yellow) approximate well the Jeffreys centroid (green), which is more computationally expensive to calculate. Right: Close-up window on the leftmost bins of the normalized histograms.
  • Figure 4: Geometric illustration of the double sequence inducing a Gauss-Bregman center in the limit.
  • Figure 5: Illustration of the double sequence convergence for scalar Gauss-Bregman $(A,m_{\nabla F})$-mean.
  • ...and 5 more figures

Theorems & Definitions (13)

  • Theorem 1: Categorical Jeffreys centroid [nielsen2013jeffreys]
  • Definition 1: Quasi-arithmetic center
  • Theorem 2: [moakher2006symmetric]
  • Definition 2: Jeffreys-Fisher-Rao (JFR) center
  • Definition 3: Gauss-Bregman $(A,\nabla F)$-center
  • Theorem 3
  • Theorem 4: Jeffreys-Fisher-Rao centroid in uni-order exponential families
  • Theorem 5: JFR centroid of categorical distributions
  • Theorem 6: JFR center of MVNs
  • Remark 1
  • ...and 3 more
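
The double-sequence principle behind the Gauss-Bregman center (Definition 3, Figures 4 and 5) can be sketched in the scalar case as follows. This is an illustration under stated assumptions rather than the paper's algorithm: each step replaces the pair $(a_t, g_t)$ by its arithmetic mean and by the quasi-arithmetic mean induced by a strictly monotone function `f`, which stands in for $\nabla F$. The names `gauss_double_sequence` and `agm`, the stopping rule, and the tolerance are hypothetical. Choosing `f = log` (that is, $F(\theta)=\theta\log\theta-\theta$ on $\theta>0$) recovers Gauss's classical arithmetic-geometric mean.

```python
import math


def gauss_double_sequence(x, y, f, f_inv, tol=1e-12, max_iter=100):
    """Common limit of the (arithmetic, quasi-arithmetic) double sequence.

    f must be strictly monotone with inverse f_inv; it stands in for the
    gradient of a strictly convex generator in the scalar case.
    """
    a, g = x, y
    for _ in range(max_iter):
        # Stop once the two sequences agree to relative tolerance.
        if abs(a - g) <= tol * max(abs(a), abs(g), 1.0):
            break
        # Simultaneous update: both new values use the old pair (a, g).
        a, g = 0.5 * (a + g), f_inv(0.5 * (f(a) + f(g)))
    return 0.5 * (a + g)


def agm(x, y):
    """Classical arithmetic-geometric mean, obtained with f = log."""
    return gauss_double_sequence(x, y, math.log, math.exp)


if __name__ == "__main__":
    print(agm(1.0, 2.0))
```

In the vector setting described by the paper, the pair would be initialized with the two sided centroids and `f`, `f_inv` would be replaced by $\nabla F$ and its inverse for the exponential family at hand; that extension is not shown here.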