Parameter estimation for Gibbs distributions

David G. Harris, Vladimir Kolmogorov

TL;DR

This work develops black-box methods for estimating the counts $c_x$ and the partition function $Z$ of Gibbs distributions in two complementary settings: a continuous setting using cooling schedules and the Paired Product Estimator (PPE), and an integer/log-concave setting using covering schedules. It proves near-optimal sample complexities: estimating the counts takes roughly $\tilde O(q/\varepsilon^2)$ samples for general Gibbs distributions and roughly $\tilde O(n^2/\varepsilon^2)$ samples in the integer-valued case, and partition-function estimation matches these bounds. The paper also gives improved algorithms for combinatorial counting problems (connected subgraphs, matchings, independent sets) via FPRAS-like results, and establishes lower bounds showing tightness in several regimes. A key contribution is the unification of sampling-to-counting techniques across the continuous and discrete settings, with algorithms that first estimate $Q(β)$ and then recover the full distribution $μ_β$ and the counts $c_x$ with provable guarantees. The methods are relevant to physics-inspired counting, graph combinatorics, and sampling-based estimation, where the partition function and the density of states are central.

Abstract

We consider Gibbs distributions, which are families of probability distributions over a discrete space $Ω$ with probability mass function of the form $μ^Ω_β(ω) \propto e^{βH(ω)}$ for $β$ in an interval $[β_{\min}, β_{\max}]$ and $H(ω) \in \{0\} \cup [1, n]$. The partition function is the normalization factor $Z(β) = \sum_{ω \in Ω} e^{βH(ω)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(β_{\max})}{Z(β_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is to estimate the counts $c_x$ using roughly $\tilde O(\frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O(\frac{n^2}{\varepsilon^2})$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs, independent sets, and perfect matchings. As a key subroutine, we also develop algorithms to compute the partition function $Z$ using $\tilde O(\frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and using $\tilde O(\frac{n^2}{\varepsilon^2})$ samples for integer-valued distributions.
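
To make the definitions above concrete, the following minimal sketch computes $Z(β)$, the log partition ratio $q$, and the distribution of $H(ω)$ under $μ_β$ for a toy instance in which the counts $c_x$ are known exactly. The toy counts and all function names are hypothetical and chosen only for illustration; this is not the paper's sampling algorithm, which has access to the distribution only through an oracle. It only illustrates the identity $\Pr_{μ_β}[H = x] = c_x e^{βx} / Z(β)$, which is why estimates of $Z(β)$ and of these probabilities can be turned into estimates of the counts.

```python
import math

def partition_function(counts, beta):
    """Z(beta) = sum over omega of exp(beta * H(omega)), grouped by value x = H(omega)."""
    return sum(c * math.exp(beta * x) for x, c in counts.items())

def value_distribution(counts, beta):
    """Probability that H(omega) = x under mu_beta: c_x * exp(beta * x) / Z(beta)."""
    z = partition_function(counts, beta)
    return {x: c * math.exp(beta * x) / z for x, c in counts.items()}

def log_partition_ratio(counts, beta_min, beta_max):
    """q = log(Z(beta_max) / Z(beta_min))."""
    return math.log(partition_function(counts, beta_max)
                    / partition_function(counts, beta_min))

# Hypothetical toy instance (an assumption, not from the paper):
# H takes values in {0, 1, 2, 3} with binomial counts, so Z(beta) = (1 + e^beta)^3.
counts = {0: 1, 1: 3, 2: 3, 3: 1}
beta_min, beta_max = 0.0, 1.0
q = log_partition_ratio(counts, beta_min, beta_max)

# Inverting the identity: c_x = Pr[H = x] * Z(beta) * exp(-beta * x).
beta = 0.5
probs = value_distribution(counts, beta)
z = partition_function(counts, beta)
recovered = {x: p * z * math.exp(-beta * x) for x, p in probs.items()}
```

Here the inversion uses the exact probabilities and the exact $Z(β)$; with sampled estimates in their place, the error in the recovered counts depends on how well each value $x$ is represented at the chosen $β$.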

Paper Structure

This paper contains 35 sections, 57 theorems, 94 equations, 12 algorithms.

Key Result

Theorem 1

$P^{\delta, \varepsilon}_{\tt count}$ can be solved with the following complexities, where cost refers to the expected number of queries to the oracle:

Theorems & Definitions (94)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • Theorem 9
  • Theorem 10
  • ...and 84 more