Codivergences and information matrices

Alexis Derumigny, Johannes Schmidt-Hieber

TL;DR

This work introduces codivergence as a local, bilinear notion of angle between probability measures around a reference $P_0$, enabling directional comparison of $P_1$ and $P_2$. It defines covariance-type $V_\phi$ and correlation-type $R_\phi$ codivergences, with central instances including the $\chi^2$-codivergence and the Hellinger codivergence, and shows these induce divergence matrices that closely resemble Gram matrices of a tangent-like space. The authors derive explicit $R_\alpha$ expressions for common parametric families (multivariate normal, Poisson, Bernoulli, and Gamma) and establish a data-processing inequality for the $\chi^2$-divergence matrix, highlighting robustness under Markov kernels. They also analyze rank properties and connect the local codivergence structure to the nonparametric Fisher information metric, offering a versatile framework for lower bounds and information-geometry-inspired analysis in both finite and infinite-dimensional settings.

Abstract

We propose a new concept of codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In the neighborhood of the reference measure $P_0$, a codivergence behaves like an inner product between the measures $P_1 - P_0$ and $P_2 - P_0$. Codivergences of covariance-type and correlation-type are introduced and studied with a focus on two specific correlation-type codivergences, the $\chi^2$-codivergence and the Hellinger codivergence. We derive explicit expressions for several common parametric families of probability distributions. For a codivergence, we introduce moreover the divergence matrix as an analogue of the Gram matrix. It is shown that the $\chi^2$-divergence matrix satisfies a data-processing inequality.
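
To make the inner-product picture concrete, the sketch below computes a $\chi^2$-divergence matrix for distributions on a four-point space and compares it before and after a Markov kernel is applied. It rests on two assumptions that this summary does not spell out: that the $\chi^2$-codivergence of $P_1, P_2$ at $P_0$ is $\int \frac{dP_1\,dP_2}{dP_0} - 1$, and that the data-processing inequality is meant in the positive semidefinite (Loewner) order. The distributions, the kernel, and all function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch, not the authors' code.
# ASSUMED definition: chi^2-codivergence(P0; P1, P2) = sum_x P1(x) P2(x) / P0(x) - 1
# on a finite sample space, with P0(x) > 0 for every x.

def chi2_codivergence(p0, p1, p2):
    """Assumed chi^2-codivergence of p1 and p2 relative to the reference p0."""
    return sum(a * b / c for a, b, c in zip(p1, p2, p0)) - 1.0

def chi2_divergence_matrix(p0, measures):
    """Gram-like matrix whose (j, k) entry is the codivergence of P_j and P_k at P0."""
    return [[chi2_codivergence(p0, pj, pk) for pk in measures] for pj in measures]

def push_forward(p, kernel):
    """Apply a Markov kernel given as a row-stochastic matrix: (pK)(y) = sum_x p(x) K[x][y]."""
    n_out = len(kernel[0])
    return [sum(p[x] * kernel[x][y] for x in range(len(p))) for y in range(n_out)]

def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is positive semidefinite iff its trace and determinant are >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol

# Hypothetical example data.
p0 = [0.25, 0.25, 0.25, 0.25]
p1 = [0.40, 0.30, 0.20, 0.10]
p2 = [0.30, 0.40, 0.20, 0.10]
kernel = [[0.9, 0.1], [0.7, 0.3], [0.3, 0.7], [0.1, 0.9]]  # each row sums to 1

before = chi2_divergence_matrix(p0, [p1, p2])
after = chi2_divergence_matrix(
    push_forward(p0, kernel), [push_forward(q, kernel) for q in (p1, p2)]
)

# Under the assumed Loewner-order reading, before - after should be PSD.
gap = [[before[i][j] - after[i][j] for j in range(2)] for i in range(2)]
print("before:", before)
print("after: ", after)
print("before - after is PSD:", is_psd_2x2(gap))
```

Under the assumed definition, the diagonal entries reduce to the classical $\chi^2$-divergences of $P_1$ and $P_2$ from $P_0$, and the off-diagonal entry is positive here because $P_1 - P_0$ and $P_2 - P_0$ point in similar directions.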

Paper Structure (18 sections, 86 equations, 1 figure, 1 table)

Figures (1)

  • Figure 1: The codivergence between $P_1$ and $P_2$ at $P_0$ measures the position of $P_1$ and $P_2$ relative to $P_0$.

Theorems & Definitions (16)

  • proof : Proof of Proposition \ref{prop:biggest_tangent_space}
  • proof : Proof of Proposition \ref{prop:div_matrices}
  • proof
  • proof
  • proof : Proof of Theorem \ref{thm.data_processing}
  • proof : Simpler proof of Theorem \ref{thm.data_processing} under the additional assumption \ref{eq:assump:common_domination}
  • proof
  • proof
  • proof
  • proof
  • ...and 6 more