Tight Bounds for Jensen's Gap with Applications to Variational Inference

Marcin Mazur; Tadeusz Dziarmaga; Piotr Kościelniak; Łukasz Struski

TL;DR

This work develops general, higher-order bounds on Jensen's gap $JG(f,X)$, with a focus on exponential and logarithmic functions, backed by analytical results and empirical validation. It generalizes existing Taylor- and moment-based bounds, giving explicit formulas and coefficients that tighten the bounds through $2k$-th order expansions; in the logarithmic case, gamma and lognormal distributions serve as key examples. The authors integrate these bounds into the PAC-Bayes framework to obtain data-dependent insights into generalization, and they demonstrate a practical log-likelihood estimation procedure for variational models, validated on real-world data. The paper also outlines limitations and societal considerations.

Abstract

Since its original formulation, Jensen's inequality has played a fundamental role across mathematics, statistics, and machine learning, with its probabilistic version highlighting the nonnegativity of the so-called Jensen's gap, i.e., the difference between the expectation of a convex function and the function at the expectation. Of particular importance is the case when the function is logarithmic, as this setting underpins many applications in variational inference, where the term variational gap is often used interchangeably. Recent research has focused on estimating the size of Jensen's gap and establishing tight lower and upper bounds under various assumptions on the underlying function and distribution, driven by practical challenges such as the intractability of log-likelihood in graphical models like variational autoencoders (VAEs). In this paper, we propose new, general bounds for Jensen's gap that accommodate a broad range of assumptions on both the function and the random variable, with special attention to exponential and logarithmic cases. We provide both analytical and empirical evidence for the performance of our method. Furthermore, we relate our bounds to the PAC-Bayes framework, providing new insights into generalization performance in probabilistic models.
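To make the definition concrete, the following is a minimal sketch, not the authors' method, that estimates Jensen's gap $JG(f,X) = \mathbb{E}[f(X)] - f(\mathbb{E}[X])$ by plain Monte Carlo. It assumes the convex function $f(x) = -\log x$ and a lognormal $X$ with $\log X \sim \mathcal{N}(\mu, \sigma^2)$, for which the gap has the standard closed form $\sigma^2/2$. The helper name `jensen_gap_mc`, the parameter values, and the sample size are illustrative choices that do not come from the paper.

```python
import numpy as np


def jensen_gap_mc(f, samples):
    """Monte Carlo estimate of JG(f, X) = E[f(X)] - f(E[X]).

    Hypothetical helper for illustration; `f` must accept NumPy arrays.
    """
    samples = np.asarray(samples, dtype=float)
    return float(np.mean(f(samples)) - f(np.mean(samples)))


def neg_log(x):
    """Convex function f(x) = -log(x) on x > 0."""
    return -np.log(x)


# Assumed illustrative parameters: log X ~ Normal(mu, sigma^2).
mu, sigma = 0.0, 0.5
rng = np.random.default_rng(0)
x = rng.lognormal(mean=mu, sigma=sigma, size=200_000)

# Empirical gap versus the closed form for the lognormal case:
# E[-log X] = -mu and -log E[X] = -(mu + sigma^2 / 2), so JG = sigma^2 / 2.
gap_mc = jensen_gap_mc(neg_log, x)
gap_exact = sigma**2 / 2

print(f"Monte Carlo gap: {gap_mc:.4f}")
print(f"Closed-form gap: {gap_exact:.4f}")
```

The empirical estimate is itself nonnegative for any sample, because Jensen's inequality also holds for the empirical distribution. How close it comes to the closed form depends on the sample size and on the variance of $X$.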
