Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond
Wei Guo, Molei Tao, Yongxin Chen
TL;DR
This work gives a non-asymptotic complexity analysis for estimating the normalizing constant $Z$ of an unnormalized density $\pi\propto e^{-V}$ with annealing-based methods, namely the Jarzynski equality (JE) and annealed importance sampling (AIS). It introduces an action-based framework that avoids strong isoperimetric assumptions on the target. The authors derive an oracle complexity bound of $\widetilde{O}\left(\frac{d\beta^2\mathcal{A}^2}{\varepsilon^4}\right)$, where $\beta$ is the smoothness of $V$ and $\mathcal{A}$ is the action of the curve of probability measures interpolating between a tractable reference distribution and $\pi$, and show that a JE-based estimator with time horizon $T=\mathcal{O}(\mathcal{A}/\varepsilon^2)$ achieves $\varepsilon$ relative accuracy with high probability. Building on these results, they establish a first non-asymptotic complexity bound for AIS and observe that the large action of the widely used geometric interpolation hinders AIS performance. This motivates a diffusion-based alternative built on reverse diffusion samplers (RDS), for which the action admits tractable bounds. The paper also provides a framework for analyzing the complexity of the RDS-based algorithm, together with empirical evidence that it can substantially improve multimodal sampling and normalizing constant estimation over AIS in challenging settings. Overall, the results offer non-asymptotic guarantees and practical algorithms for estimating partition functions in high-dimensional, multimodal landscapes without strong log-concavity assumptions, with implications for Bayesian model evidence, free-energy computation, and energy-based modeling.
Abstract
Given an unnormalized probability density $\pi\propto\mathrm{e}^{-V}$, estimating its normalizing constant $Z=\int_{\mathbb{R}^d}\mathrm{e}^{-V(x)}\mathrm{d}x$ or free energy $F=-\log Z$ is a crucial problem in Bayesian statistics, statistical mechanics, and machine learning. It is especially challenging in high dimensions or when $\pi$ is multimodal. To mitigate the high variance of conventional importance sampling estimators, annealing-based methods such as the Jarzynski equality and annealed importance sampling are commonly adopted, yet their quantitative complexity guarantees remain largely unexplored. We take a first step toward a non-asymptotic analysis of annealed importance sampling. In particular, we derive an oracle complexity of $\widetilde{O}\left(\frac{d\beta^2{\mathcal{A}}^2}{\varepsilon^4}\right)$ for estimating $Z$ within $\varepsilon$ relative error with high probability, where $\beta$ is the smoothness of $V$ and $\mathcal{A}$ denotes the action of a curve of probability measures interpolating $\pi$ and a tractable reference distribution. Our analysis, which leverages the Girsanov theorem and optimal transport, does not explicitly require isoperimetric assumptions on the target distribution. Finally, to tackle the large action of the widely used geometric interpolation, we propose a new algorithm based on reverse diffusion samplers, establish a framework for analyzing its complexity, and empirically demonstrate its efficiency in tackling multimodality.
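
As a rough illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of annealed importance sampling with the geometric interpolation $\pi_\lambda\propto\mathrm{e}^{-(1-\lambda)V_0-\lambda V}$ in one dimension. It is not the algorithm analyzed in the paper: the standard Gaussian reference $V_0(x)=x^2/2$, the linear schedule for $\lambda$, the random-walk Metropolis transition kernel, and all names and default parameters (`ais_estimate_Z`, `n_particles`, `n_steps`, `mh_step`) are assumptions chosen for illustration. The target is a Gaussian whose normalizing constant is known in closed form, so the estimate can be checked.

```python
import numpy as np

def ais_estimate_Z(V, n_particles=2000, n_steps=100, mh_step=0.5, seed=0):
    """Illustrative 1D AIS estimate of Z = integral of exp(-V(x)) dx.

    Assumptions (not taken from the paper): reference N(0, 1), linear
    schedule in lambda, one random-walk Metropolis step per level.
    """
    rng = np.random.default_rng(seed)
    V0 = lambda x: 0.5 * x**2              # reference potential, N(0, 1)
    Z0 = np.sqrt(2.0 * np.pi)              # its known normalizing constant
    lambdas = np.linspace(0.0, 1.0, n_steps + 1)

    def U(x, lam):
        # Potential of the geometric interpolation at level lam.
        return (1.0 - lam) * V0(x) + lam * V(x)

    x = rng.standard_normal(n_particles)   # exact samples from the reference
    log_w = np.zeros(n_particles)
    for k in range(1, n_steps + 1):
        lam_prev, lam = lambdas[k - 1], lambdas[k]
        # Incremental importance weight: log f_k(x) - log f_{k-1}(x).
        log_w += -(lam - lam_prev) * (V(x) - V0(x))
        # One Metropolis step that leaves pi_k invariant.
        prop = x + mh_step * rng.standard_normal(n_particles)
        accept = np.log(rng.random(n_particles)) < U(x, lam) - U(prop, lam)
        x = np.where(accept, prop, x)

    # Average the weights in a numerically stable way (log-sum-exp).
    m = log_w.max()
    log_ratio = m + np.log(np.mean(np.exp(log_w - m)))  # estimates log(Z / Z0)
    return Z0 * np.exp(log_ratio)

# Gaussian target with mean 1 and standard deviation 0.5.
mean, std = 1.0, 0.5
V = lambda x: (x - mean) ** 2 / (2.0 * std**2)
Z_true = np.sqrt(2.0 * np.pi) * std
Z_hat = ais_estimate_Z(V)
print(f"AIS estimate: {Z_hat:.4f}, exact: {Z_true:.4f}")
```

The Metropolis step is used here because it leaves each intermediate distribution exactly invariant, which keeps the weight average unbiased for $Z/Z_0$; a Langevin-type discretization would introduce a discretization error that this sketch does not account for.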
