Codivergences and information matrices

Alexis Derumigny; Johannes Schmidt-Hieber

TL;DR

This work introduces the codivergence, a local, bilinear notion of angle between probability measures around a reference measure $P_0$, which enables a directional comparison of $P_1$ and $P_2$. It defines covariance-type codivergences $V_\phi$ and correlation-type codivergences $R_\phi$, with central instances including the $\chi^2$-codivergence and the Hellinger codivergence, and shows that these induce divergence matrices that closely resemble Gram matrices of a tangent-like space. The authors derive explicit expressions for $R_\alpha$ for common parametric families (multivariate normal, Poisson, Bernoulli, and Gamma) and establish a data-processing inequality for the $\chi^2$-divergence matrix under Markov kernels. They also analyze rank properties and connect the local codivergence structure to the nonparametric Fisher information metric, offering a framework for lower bounds and information-geometry-inspired analysis in both finite- and infinite-dimensional settings.

Abstract

We propose a new concept of codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In the neighborhood of the reference measure $P_0$, a codivergence behaves like an inner product between the measures $P_1 - P_0$ and $P_2 - P_0$. Codivergences of covariance-type and correlation-type are introduced and studied, with a focus on two specific correlation-type codivergences: the $\chi^2$-codivergence and the Hellinger codivergence. We derive explicit expressions for several common parametric families of probability distributions. For a codivergence, we moreover introduce the divergence matrix as an analogue of the Gram matrix. It is shown that the $\chi^2$-divergence matrix satisfies a data-processing inequality.
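
The inner-product behaviour described in the abstract can be illustrated numerically for discrete distributions. The sketch below is not taken from the paper: it assumes that the $\chi^2$-codivergence has the form $\int dP_1\, dP_2 / dP_0 - 1$, which this page does not state explicitly, and the function names `chi2_codivergence` and `chi2_divergence_matrix` are hypothetical. Under that assumption the codivergence equals the weighted inner product $\sum_x (p_1(x) - p_0(x))(p_2(x) - p_0(x)) / p_0(x)$, so the divergence matrix is a Gram matrix and hence positive semidefinite. The last lines check one instance of a data-processing inequality, read here as an ordering of matrices under a Markov kernel that merges states.

```python
import numpy as np


def chi2_codivergence(p0, p1, p2):
    """Assumed form: sum_x p1(x) * p2(x) / p0(x) - 1, for discrete p0 > 0."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    return float(np.sum(p1 * p2 / p0) - 1.0)


def chi2_divergence_matrix(p0, ps):
    """Matrix whose (j, k) entry is chi2_codivergence(p0, ps[j], ps[k])."""
    p0 = np.asarray(p0, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return (ps / p0) @ ps.T - 1.0


# Toy distributions on four points (illustrative values only).
p0 = np.array([0.25, 0.25, 0.25, 0.25])
p1 = np.array([0.40, 0.30, 0.20, 0.10])
p2 = np.array([0.10, 0.20, 0.30, 0.40])

# The codivergence agrees with the inner product of P1 - P0 and P2 - P0
# weighted by 1 / p0.
codiv = chi2_codivergence(p0, p1, p2)
inner = float(np.sum((p1 - p0) * (p2 - p0) / p0))

# Gram-type matrix: its eigenvalues are nonnegative up to rounding.
matrix = chi2_divergence_matrix(p0, [p1, p2])
eigenvalues = np.linalg.eigvalsh(matrix)

# Markov kernel that merges states {0, 1} and {2, 3} (rows sum to one).
K = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
coarse = chi2_divergence_matrix(p0 @ K, [p1 @ K, p2 @ K])

# Data processing, checked on this example only: matrix - coarse should be
# positive semidefinite.
gap = matrix - coarse

print(codiv, inner)
print(eigenvalues)
print(np.linalg.eigvalsh(gap))
```

In this example $P_1 - P_0 = -(P_2 - P_0)$, so the two directions are collinear and the divergence matrix has rank one, which gives a small instance of the rank questions listed in the paper outline.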

Paper Structure (18 sections, 86 equations, 1 figure, 1 table)

Introduction
Codivergences
Abstract framework and definition
Codivergences on the space of probability measures
Examples of codivergences
Divergence matrices
Data processing inequality for the $\chi^2$-divergence matrix
Derivations for explicit expressions for the $R_\alpha$ codivergence
Multivariate normal distribution
Poisson distribution
Bernoulli distribution
Gamma distribution
Facts about ranks
Conclusion
Proofs
...and 3 more sections

Figures (1)

Figure 1: The codivergence between $P_1$ and $P_2$ at $P_0$ measures the position of $P_1$ and $P_2$ relative to $P_0$.

Theorems & Definitions (16)

proof : Proof of Proposition \ref{prop:biggest_tangent_space}
proof : Proof of Proposition \ref{prop:div_matrices}
proof
proof
proof : Proof of Theorem \ref{thm.data_processing}
proof : Simpler proof of Theorem \ref{thm.data_processing} under the additional assumption \ref{eq:assump:common_domination}
proof
proof
proof
proof
...and 6 more
