Parameter estimation for Gibbs distributions
David G. Harris, Vladimir Kolmogorov
TL;DR
This work develops black-box methods to estimate the counts $c_x$ and the partition function $Z$ of Gibbs distributions in two complementary settings: a continuous setting that uses cooling schedules and the Paired Product Estimator (PPE), and an integer-valued/log-concave setting that uses covering schedules. It proves near-optimal sample complexities: estimating the counts takes roughly $\tilde{O}(q/\varepsilon^2)$ samples for general Gibbs distributions and $\tilde{O}(n^2/\varepsilon^2)$ samples in the integer-valued case, and partition-function estimation matches these bounds. The paper also gives concrete improvements for combinatorial counting problems (connected subgraphs, matchings, independent sets) via FPRAS-like results, and establishes lower bounds showing that the upper bounds are tight in several regimes. A key contribution is the unification of sampling-to-counting techniques across the continuous and discrete settings, with practical algorithms that first estimate Q(β) and then recover the full distribution $μ_β$ and the counts $c_x$ with provable guarantees. The methods are relevant to physics-inspired counting, graph combinatorics, and sampling-based estimation, where the partition function and the density of states are central.
Abstract
We consider Gibbs distributions, which are families of probability distributions over a discrete space $Ω$ with probability mass function of the form $μ^Ω_β(ω) \propto e^{βH(ω)}$ for $β$ in an interval $[β_{\min}, β_{\max}]$ and $H( ω) \in \{0 \} \cup [1, n]$. The partition function is the normalization factor $Z(β)=\sum_{ω\inΩ}e^{βH(ω)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(β_{\max})}{Z(β_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is to estimate the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs, independent sets, and perfect matchings. As a key subroutine, we also develop algorithms to compute the partition function $Z$ using $\tilde O(\frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and using $\tilde O(\frac{n^2}{\varepsilon^2})$ samples for integer-valued distributions.
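To make the definitions above concrete, the following minimal sketch computes the counts $c_x$, the partition function $Z(β)$, and the log partition ratio $q$ by brute-force enumeration on a toy space. It is not the paper's sample-based algorithm: the space $Ω = \{0,1\}^3$, the choice $H(ω)$ = number of ones, the interval $[0, 1]$, and all function names are assumptions chosen only for illustration. It relies on the identity $Z(β) = \sum_x c_x e^{βx}$, which follows from grouping the terms of the sum by the value of $H$.

```python
import math
from collections import Counter
from itertools import product


def hamiltonian(omega):
    # Toy choice (an assumption): H(omega) = number of ones, so H takes values in {0, 1, ..., n}.
    return sum(omega)


def counts(n):
    # c_x = |H^{-1}(x)|, computed by enumerating all of Omega = {0,1}^n.
    return Counter(hamiltonian(w) for w in product((0, 1), repeat=n))


def partition_function(c, beta):
    # Z(beta) = sum over omega of e^{beta H(omega)} = sum over x of c_x e^{beta x}.
    return sum(cx * math.exp(beta * x) for x, cx in c.items())


def log_partition_ratio(c, beta_min, beta_max):
    # q = log(Z(beta_max) / Z(beta_min)).
    return math.log(partition_function(c, beta_max) / partition_function(c, beta_min))


def gibbs_over_values(c, beta):
    # Distribution of H(omega) when omega is drawn from the Gibbs distribution at beta.
    z = partition_function(c, beta)
    return {x: cx * math.exp(beta * x) / z for x, cx in c.items()}


n = 3
beta_min, beta_max = 0.0, 1.0  # illustrative interval, not taken from the paper
c = counts(n)
q = log_partition_ratio(c, beta_min, beta_max)
mu = gibbs_over_values(c, beta_max)

# Inverting the definition: c_x = mu_beta(x) * Z(beta) * e^{-beta x}.
z_max = partition_function(c, beta_max)
recovered = {x: mu[x] * z_max * math.exp(-beta_max * x) for x in mu}

print(dict(c), q, recovered)
```

For this toy space the counts are binomial coefficients and $Z(β) = (1 + e^{β})^3$, which gives a closed form to check against. In the black-box setting of the paper, neither $c_x$ nor $Z$ is available by enumeration; only samples from $μ_β$ are, which is what makes the sample-complexity bounds the relevant measure of cost.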
