Convergence of coordinate ascent variational inference for log-concave measures via optimal transport
Manuel Arnese, Daniel Lacker
TL;DR
This work develops a general convergence theory for Coordinate Ascent Variational Inference (CAVI), the standard algorithm for mean field variational inference (MFVI), when the target is log-concave. It shows that the MFVI objective is geodesically convex in Wasserstein space and that CAVI acts as a block coordinate descent in that geometry. Under mild integrability, the iterates converge to a minimizer; a strictly convex potential yields a unique minimizer and weak convergence of the iterates. If the log-density has a Lipschitz gradient, the paper proves a linear convergence rate, and if the target is also strongly log-concave, an exponential rate, both expressed in terms of the Wasserstein diameter $R$ and problem constants. The Gaussian special case exhibits a dimension-free exponential rate. Overall, the results combine optimal-transport geometry with classical convex optimization to give convergence guarantees for MFVI via CAVI.
Abstract
Mean field variational inference (VI) is the problem of finding the closest product (factorized) measure, in the sense of relative entropy, to a given high-dimensional probability measure $\rho$. The well-known Coordinate Ascent Variational Inference (CAVI) algorithm aims to approximate this product measure by iteratively optimizing over one coordinate (factor) at a time, which can be done explicitly. Despite its popularity, the convergence of CAVI remains poorly understood. In this paper, we prove the convergence of CAVI for log-concave densities $\rho$. If additionally $\log \rho$ has Lipschitz gradient, we find a linear rate of convergence, and if also $\rho$ is strongly log-concave, we find an exponential rate. Our analysis starts from the observation that mean field VI, while notoriously non-convex in the usual sense, is in fact displacement convex in the sense of optimal transport when $\rho$ is log-concave. This allows us to adapt techniques from the optimization literature on coordinate descent algorithms in Euclidean space.
