Fisher-Rao Gradient Flow: Geodesic Convexity and Functional Inequalities

José A. Carrillo; Yifan Chen; Daniel Zhengyu Huang; Jiaoyang Huang; Dongyi Wei

TL;DR

The paper studies Fisher-Rao gradient flows for general $f$-divergences, aiming to derive convergence guarantees that are uniform across target distributions. It shows that the KL divergence is not geodesically convex and does not satisfy gradient dominance in the Fisher-Rao geometry, motivating conditions on $f$ under which geodesic convexity and gradient dominance hold, including a three-point reduction via linear programming. A novel dual gradient-dominance framework is introduced, with explicit results for reverse $\chi^2$ and $*$-conjugate divergences, yielding exponential convergence bounds for the combined energy under well-posed dynamics. The results highlight a robust pathway to uniform convergence of FR-gradient flows irrespective of the target density, with potential impact on posterior sampling and Bayesian inference. The work sets up future directions on well-posedness, non-smooth energies, and efficient numerical schemes for FR-gradient-based sampling.

Abstract

The dynamics of probability density functions has been extensively studied in science and engineering to understand physical phenomena and facilitate algorithmic design. Of particular interest are dynamics that can be formulated as gradient flows of energy functionals under the Wasserstein metric. The development of functional inequalities, such as the log-Sobolev inequality, plays a pivotal role in analyzing the convergence of these dynamics. The goal of this paper is to parallel the success of techniques using functional inequalities, for dynamics that are gradient flows under the Fisher-Rao metric, with various $f$-divergences as energy functionals. Such dynamics take the form of a nonlocal differential equation, for which existing analysis critically relies on using the explicit solution formula in special cases. We provide a comprehensive study on functional inequalities and the relevant geodesic convexity for Fisher-Rao gradient flows under minimal assumptions. A notable feature of the obtained functional inequalities is that they do not depend on the log-concavity or log-Sobolev constants of the target distribution. Consequently, the convergence rate of the dynamics (assuming well-posed) is uniform across general target distributions, making them potentially desirable dynamics for posterior sampling applications in Bayesian inference.
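As an illustration of the "explicit solution formula in special cases" mentioned in the abstract, the sketch below simulates the Fisher-Rao gradient flow of the KL divergence and compares it with the known closed form $\rho_t \propto \rho_0^{e^{-t}} (\rho^{*})^{1-e^{-t}}$. It is not taken from the paper: it assumes a finite state space (so densities are probability vectors), the common convention $\partial_t \rho = -\rho\,(\log(\rho/\rho^{*}) - \mathbb{E}_{\rho}[\log(\rho/\rho^{*})])$ for the flow, and a hand-rolled RK4 integrator. All function names are hypothetical.

```python
import numpy as np


def fr_kl_rhs(rho, target):
    """Right-hand side of the Fisher-Rao gradient flow of KL(rho || target).

    Assumed convention: d rho/dt = -rho * (g - E_rho[g]), g = log(rho/target).
    Subtracting the mean of g is the nonlocal term that conserves total mass.
    """
    g = np.log(rho / target)
    return -rho * (g - np.sum(rho * g))


def simulate(rho0, target, t_end, n_steps):
    """Integrate the flow with classical RK4 on a finite state space."""
    rho = np.asarray(rho0, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = fr_kl_rhs(rho, target)
        k2 = fr_kl_rhs(rho + 0.5 * dt * k1, target)
        k3 = fr_kl_rhs(rho + 0.5 * dt * k2, target)
        k4 = fr_kl_rhs(rho + dt * k3, target)
        rho = rho + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        rho = rho / rho.sum()  # guard against round-off drift in total mass
    return rho


def explicit_solution(rho0, target, t):
    """Closed form for the KL case: geometric interpolation, renormalized."""
    w = np.exp(-t)
    unnormalized = np.asarray(rho0) ** w * np.asarray(target) ** (1.0 - w)
    return unnormalized / unnormalized.sum()


def kl(rho, target):
    """Discrete KL divergence, assuming strictly positive entries."""
    return float(np.sum(rho * np.log(rho / target)))


# Illustrative three-state example; the numbers are arbitrary.
rho0 = np.array([0.7, 0.2, 0.1])
target = np.array([0.1, 0.3, 0.6])

numeric = simulate(rho0, target, t_end=1.0, n_steps=2000)
closed = explicit_solution(rho0, target, t=1.0)
print("max abs difference:", np.max(np.abs(numeric - closed)))
print("KL at t=0:", kl(rho0, target), " KL at t=1:", kl(numeric, target))
```

The closed form is specific to the KL energy; for a general $f$-divergence no such formula is assumed here, which is why the paper turns to functional inequalities for its convergence analysis.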

Paper Structure (28 sections, 15 theorems, 170 equations)

Introduction
Wasserstein gradient flows: Functional inequalities and convexity
Analysis of Fisher-Rao gradient flows
Main results
Related works
Displacement convexity and functional inequalities
Fisher-Rao gradient flow
Organization of this paper
Fisher-Rao Gradient Flows of $f$-divergences
Geodesic Convexity in the Fisher-Rao Geometry
Preliminaries on convex analysis
Hessian operator in the Fisher-Rao geometry
Geodesic convexity in the Fisher-Rao geometry
Functional Inequality: Gradient Dominance Condition
KL divergence: no gradient dominance
...and 13 more sections

Key Result

Theorem 1.1

Under the Fisher-Rao geometry, the KL divergence is not geodesically convex for any ${\rho^{*}}$, even within a small neighborhood of ${\rho^{*}}$. Moreover, the gradient dominance condition is not satisfied for any ${\rho^{*}}$, also within any small neighborhood of ${\rho^{*}}$. Here the "small ne

Theorems & Definitions (33)

Theorem 1.1: informal
Theorem 1.2: informal
Theorem 1.3: informal
Remark 3.1
Remark 3.2
Theorem 3.3
proof
Theorem 3.4
proof
Theorem 4.1
...and 23 more
