Glivenko-Cantelli for $f$-divergence

Haoming Wang; Lek-Heng Lim

TL;DR

This work extends the Glivenko--Cantelli framework to general $f$-divergences by introducing an $f$-divergence over the ray class $\mathcal{R}$, addressing the challenge that $\mathcal{R}$ is not a $\sigma$-algebra. The core idea is to use a Radon--Nikodym-type property and projected densities $\mathrm{proj}_{G(\mathcal{R})}(d\mu/d\nu)$ to define $\mathop{\mathrm{D}}_f^{\mathcal{R}}(\mu \Vert \nu)$, which recovers the Kolmogorov--Smirnov distance for $f(t)=\frac{|t-1|}{2}$ and the standard $f$-divergence when extended to $\mathcal{B}$. The paper proves linearity, nonnegativity, affine invariance, and a basic identity for these divergences, and establishes Glivenko--Cantelli theorems for $\mathcal{R}$-divergences: both $\mathop{\mathrm{D}}_f^{\mathcal{R}}(\nu_n \Vert \nu)$ and $\mathop{\mathrm{D}}_f^{\mathcal{R}}(\nu \Vert \nu_n)$ converge to zero almost surely. It also outlines a preliminary Vapnik--Chervonenkis theory for $f$-divergence via pre-Glivenko--Cantelli classes and discusses the limitations of Choquet-integral approaches in this setting. Overall, the work points toward Glivenko--Cantelli-type guarantees for a wide class of divergences beyond total variation and Kolmogorov--Smirnov, with potential impact on learning theory and empirical process methods.
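To make the choice $f(t)=\frac{|t-1|}{2}$ concrete, the following worked equations recall the standard facts behind it. They are a sketch under stated assumptions, not the paper's construction: we assume $\mu \ll \nu$ on $\mathbb{R}$, take $\mathcal{B}$ to be the Borel $\sigma$-algebra, and take rays of the form $(-\infty, x]$, a convention the paper may state differently.

```latex
% Standard f-divergence (f convex, f(1) = 0), assuming \mu \ll \nu:
\[
  \mathop{\mathrm{D}}\nolimits_f(\mu \Vert \nu)
  = \int f\!\left(\frac{d\mu}{d\nu}\right) d\nu .
\]
% With f(t) = |t-1|/2 this is the total variation distance:
\[
  \mathop{\mathrm{D}}\nolimits_f(\mu \Vert \nu)
  = \frac{1}{2} \int \left| \frac{d\mu}{d\nu} - 1 \right| d\nu
  = \sup_{A \in \mathcal{B}} \bigl| \mu(A) - \nu(A) \bigr| .
\]
% Restricting the supremum from all Borel sets to rays gives the
% Kolmogorov--Smirnov distance:
\[
  \mathrm{KS}(\mu, \nu)
  = \sup_{x \in \mathbb{R}}
    \bigl| \mu\bigl((-\infty, x]\bigr) - \nu\bigl((-\infty, x]\bigr) \bigr| .
\]
```

The second display has a direct integral form because $\mathcal{B}$ is a $\sigma$-algebra. The third does not come with one in the classical setting, which is the gap the $\mathcal{R}$-divergence is meant to fill.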

Abstract

We extend the celebrated Glivenko-Cantelli theorem, sometimes called the fundamental theorem of statistics, from its standard setting of total variation distance to all $f$-divergences. A key obstacle in this endeavor is to define $f$-divergence on a subcollection of a $\sigma$-algebra that forms a $\pi$-system but not a $\sigma$-subalgebra. Overcoming this obstacle is a side contribution of our work. We will show that this notion of $f$-divergence on the $\pi$-system of rays preserves nearly all known properties of standard $f$-divergence, yields a novel integral representation of the Kolmogorov-Smirnov distance, and has a Glivenko-Cantelli theorem. We will also discuss the prospects of a Vapnik-Chervonenkis theory for $f$-divergence.
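As a point of reference for the classical theorem being extended, here is a minimal simulation sketch of the Kolmogorov-Smirnov distance between an empirical distribution and the true one. It illustrates only the classical statement, not the $f$-divergence on rays defined in the paper. The Uniform(0, 1) distribution, the sample sizes, and the function names `ks_distance_uniform` and `demo` are assumptions chosen for illustration.

```python
import random


def ks_distance_uniform(sample):
    """Return sup_x |F_n(x) - F(x)| where F is the Uniform(0, 1) CDF.

    Assumes every sample point lies in [0, 1], so that F(x) = x there.
    """
    xs = sorted(sample)
    n = len(xs)
    # F_n jumps by 1/n at each sorted point, so the supremum is reached
    # either just after a jump ((i + 1)/n - x) or just before it (x - i/n).
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def demo(sizes=(100, 1_000, 10_000, 100_000), seed=0):
    """Map each sample size n to the KS distance of n uniform draws."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    return {n: ks_distance_uniform([rng.random() for _ in range(n)])
            for n in sizes}


if __name__ == "__main__":
    for n, d in demo().items():
        print(f"n = {n:>7d}   KS distance = {d:.5f}")
```

The printed distances should shrink roughly like $1/\sqrt{n}$, though they need not decrease at every step for a single random sequence.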
