Hellinger-Kantorovich Gradient Flows: Global Exponential Decay of Entropy Functionals

Alexander Mielke, Jia-Jie Zhu

TL;DR

This paper develops the Hellinger-Kantorovich (HK) gradient-flow framework, unifying transport (Otto–Wasserstein) and birth–death (Fisher–Rao) dynamics for positive and probability measures. It generalizes entropy energies to ${\varphi_p}$-divergences and analyzes gradient flows across HK, SHK, and OT geometries using Polyak–Łojasiewicz-type inequalities, highlighting when global convergence can be guaranteed. A key contribution is the shape–mass decomposition, which enables global exponential decay results for HK gradient flows driven by the KL energy despite the absence of a global logarithmic Sobolev inequality on ${\mathcal{M}}^+$. The results provide a unified theoretical framework with implications for computational methods in statistical inference, optimization, and machine learning, including explicit decay rates and Lyapunov structures across multiple geometries.

Abstract

We investigate a family of gradient flows of positive and probability measures, focusing on the Hellinger-Kantorovich (HK) geometry, which unifies the transport mechanism of Otto-Wasserstein and the birth-death mechanism of Hellinger (or Fisher-Rao). A central contribution is a complete characterization of the global exponential decay behavior of entropy functionals (e.g. KL, $\chi^2$) under Otto-Wasserstein and Hellinger-type gradient flows. In particular, for the more challenging analysis of HK gradient flows on positive measures, where the typical log-Sobolev arguments fail, we develop a specialized shape-mass decomposition that enables new analytical results. Our approach also leverages (Polyak-)Łojasiewicz-type functional inequalities and a careful extension of classical dissipation estimates. These findings provide a unified and complete theoretical framework for gradient flows and underpin applications in computational algorithms for statistical inference, optimization, and machine learning.

Paper Structure

This paper contains 19 sections, 24 theorems, 140 equations, 7 figures, 3 tables.

Key Result

Theorem 2.3

[LiMiSa16OTCR] The Hellinger-Kantorovich distance over positive measures $\mathcal{M}^+$ has an equivalent characterization as the optimal value of the logarithmic-entropy-transport (LET) problem, where the functional $\Psi$ is the (scaled) KL divergence $\Psi(u|v) := \frac{1}{\beta}\,\mathrm{D}_{\mathrm{KL}}(u|v)$ and the transport cost is

Figures (7)

  • Figure 1: The plot illustrates the power-like entropy generator functions ${\varphi_p}(s)$ for $s\in [0,1.2]$ and different $p$: purple $p=0$ (forward KL), green $p=0.25$, blue $p=0.5$ (Hellinger), red $p=1$ (KL), orange $p=2$ ($\chi^2$). The large red dot represents the equilibrium at $s=1$ where ${\varphi_p}(1)={\varphi_p}'(1)=0$.
  • Figure 2: The two figures illustrate the conceptual advantage of combining the Otto-Wasserstein and the Hellinger gradient flows. On the left, the particles are transported by the gradient descent enabled by the Otto-Wasserstein gradient flow, where masses do not change. On the right, the dashed arrows represent the "teleportation" of mass enabled by the Hellinger gradient flow, where the positions do not change. The size of the dots represents the amount of mass of the particles.
  • Figure 3: The plot illustrates the lack of a global Łojasiewicz inequality as in Lemma \ref{lm:local-vs-global-Loj}. The blue dotted curve represents the KL-entropy generator function $\varphi(s) = s\log s - s + 1$. The function $s|\log s|^2$ is plotted in solid black. The Łojasiewicz inequality condition is satisfied locally around the equilibrium $s = 1$ (red dot). However, it can never be satisfied in a neighborhood around $s=0$.
  • Figure 4: Illustration of Example \ref{ex:escape-zero}: birth escaping (near) zero with initial densities $\mu_0$ (red) and target densities $\pi$ (blue, Gaussian).
  • Figure 5: The plot illustrates the left-hand side $s({\varphi_p}'(s))^2$ of the Łojasiewicz inequality \ref{eq:FR.Loja.Cond} for the Hellinger geometry for $s\in [0,1.2]$ and different $p$: purple $p=2$ ($\chi^2$), red $p=1$ (KL), green $p=0.5$ (Hellinger), orange $p=0.25$, blue $p=0$ (forward KL). The red dot represents the equilibrium at $s=1$, where ${\varphi_p}'(s)=0$. This plot provides insight into the slopes of the power-like entropies in the Hellinger gradient flow. Indeed, Proposition \ref{prop:loj-phi} discusses the relation of the corresponding curves ${\varphi_p}(s)$ in Figure \ref{fig:power-ents} with those here. We observe the threshold $p=0.5$ (Hellinger; green), where the behavior near $s=0$ jumps. See the main text, especially Remark \ref{rem:power-threshold}, for analysis.
  • ...and 2 more figures
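
The behavior described in the captions of Figures 3 and 5 can be checked numerically. The following is a minimal sketch, not the authors' code. It rests on two assumptions that are not stated in this summary: the closed form of the power-like generator, taken here as ${\varphi_p}(s) = (s^p - ps + p - 1)/(p(p-1))$ with the limits $s\log s - s + 1$ at $p=1$ and $s - 1 - \log s$ at $p=0$ (chosen because it satisfies ${\varphi_p}(1)={\varphi_p}'(1)=0$ and matches the KL, Hellinger, and $\chi^2$ labels), and the reading of the Łojasiewicz condition as $s({\varphi_p}'(s))^2 \ge c\,{\varphi_p}(s)$ for some $c>0$. The helper names `phi`, `dphi`, and `loj_ratio` are hypothetical.

```python
import math


def phi(p, s):
    """Assumed power-like entropy generator with phi(1) = phi'(1) = 0."""
    if p == 1:
        return s * math.log(s) - s + 1.0   # KL generator (as in Figure 3)
    if p == 0:
        return s - 1.0 - math.log(s)       # assumed p -> 0 limit
    return (s**p - p * s + p - 1.0) / (p * (p - 1.0))


def dphi(p, s):
    """Derivative of the assumed generator with respect to s."""
    if p == 1:
        return math.log(s)
    return (s ** (p - 1.0) - 1.0) / (p - 1.0)


def loj_ratio(p, s):
    """Ratio s * phi'(s)^2 / phi(s); undefined at the equilibrium s = 1.

    Under the assumed reading of the Lojasiewicz condition, a global
    inequality requires this ratio to stay bounded away from zero.
    """
    return s * dphi(p, s) ** 2 / phi(p, s)


if __name__ == "__main__":
    # Print the ratio as s approaches 0 for the values of p used in the figures.
    for p in (0.0, 0.25, 0.5, 1.0, 2.0):
        values = [loj_ratio(p, s) for s in (1e-2, 1e-4, 1e-8)]
        print(f"p = {p}: {values}")
```

Under these assumptions the ratio tends to zero as $s \to 0$ for $p=1$ and $p=2$, stays constant for $p=0.5$, and grows for $p<0.5$, which is consistent with the threshold at $p=0.5$ mentioned in the caption of Figure 5 and with the local-only inequality for KL shown in Figure 3.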

Theorems & Definitions (38)

  • Definition 2.1: Gradient system
  • Example 2.1: Classical PDE: Allen-Cahn and Cahn-Hilliard
  • Example 2.2: Otto-Wasserstein geodesics in Hamiltonian formulation
  • Example 2.3: Hellinger geodesics in Hamiltonian formulation
  • Remark 2.2: "Hellinger" versus "Fisher-Rao"
  • Theorem 2.3: Logarithmic-Entropy-Transport definition of $\mathsf H\!\!\mathsf K$
  • Example 2.4: Reaction-diffusion PDE
  • Proposition 2.4: Explicit formula for $\mathsf S\!\mathsf H\!\!\mathsf K_{\alpha,\beta}$
  • Definition 3.1: Polyak-Łojasiewicz inequality for generalized gradient systems
  • Theorem 3.2: Functional inequality for pure Otto-Wasserstein: ${\mathbb R}^d$
  • ...and 28 more