Partition function approach to non-Gaussian likelihoods: information theory and state variables for Bayesian inference

Rebecca Maria Kuntz, Heinrich von Campe, Tobias Röspel, Maximilian Philipp Herzog, Björn Malte Schäfer

TL;DR

This work builds a partition-function framework that treats Bayesian inference as a thermodynamics-like process by introducing a Bayes partition function $Z[T,J]$ and an invariant volume form from information geometry, enabling a thermodynamic interpretation of sampling in parameter space. It introduces a continuous Bayes update parameter $\lambda$ and a Jarzynski-type relation for the information gained during updating, linking update work to the likelihood potential; it then develops a Gaussian Bayes partition with explicit expressions for energy, entropy, and heat capacity, and generalizes to a grand ensemble with variable sampler number. The authors connect Bayes partitions to KL and Rényi entropies, define an effective dimension $n_{\mathrm{eff}}$ to quantify non-Gaussian complexity, and illustrate the framework with a cosmology application to SN Ia data, showing how heating (increasing $T$) reduces non-Gaussian impacts and recovers Bayesian evidence at $T=1$. Overall, the paper provides a unifying information-thermodynamics perspective on inference, with practical implications for experimental design, model comparison, and understanding sampler dynamics across Gaussian and non-Gaussian regimes.
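The role of the temperature $T$ can be illustrated on a one-dimensional toy problem. The sketch below assumes the form $Z[T] = \int \mathrm{d}\theta\, [\mathcal{L}(y|\theta)\,\pi(\theta)]^{1/T}$ with the source $J$ set to zero; this form, the Gaussian toy model, and all names and numerical values are illustrative assumptions rather than the paper's definitions. It checks numerically that $Z[T=1]$ reproduces the Bayesian evidence of the toy model.

```python
import numpy as np

# Hypothetical toy setup: one datum, Gaussian noise width, Gaussian prior width.
Y_OBS, SIGMA, TAU = 1.0, 0.5, 2.0


def log_joint(theta):
    """ln[L(y|theta) * pi(theta)] for a Gaussian likelihood and a Gaussian prior."""
    log_like = -0.5 * ((Y_OBS - theta) / SIGMA) ** 2 - 0.5 * np.log(2 * np.pi * SIGMA**2)
    log_prior = -0.5 * (theta / TAU) ** 2 - 0.5 * np.log(2 * np.pi * TAU**2)
    return log_like + log_prior


def log_partition(T, theta=np.linspace(-30.0, 30.0, 60001)):
    """ln Z[T] for the assumed form Z[T] = integral of (L * pi)**(1/T), with J = 0."""
    exponent = log_joint(theta) / T
    shift = exponent.max()  # subtract the maximum for numerical stability
    step = theta[1] - theta[0]
    # Plain Riemann sum; the integrand is negligible at the grid boundaries.
    return shift + np.log(np.sum(np.exp(exponent - shift)) * step)


def log_evidence_exact():
    """Closed-form ln p(y) for this toy model, where y ~ N(0, SIGMA^2 + TAU^2)."""
    var = SIGMA**2 + TAU**2
    return -0.5 * Y_OBS**2 / var - 0.5 * np.log(2 * np.pi * var)


if __name__ == "__main__":
    for T in (0.5, 1.0, 2.0, 4.0):
        print(f"T = {T:3.1f}   ln Z[T] = {log_partition(T):+.6f}")
    print(f"exact ln evidence = {log_evidence_exact():+.6f}")
```

For a non-Gaussian posterior the same grid integration applies unchanged; only `log_joint` would need to be replaced.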

Abstract

The significance of statistical physics concepts such as entropy extends far beyond classical thermodynamics. We interpret the similarity between partitions in statistical mechanics and partitions in Bayesian inference as an articulation of a result by Jaynes (1957), who clarified that thermodynamics is in essence a theory of information. In this, every sampling process has a mechanical analogue. Consequently, the divide between ensembles of samplers in parameter space and sampling from a mechanical system in thermodynamic equilibrium would be artificial. Based on this realisation, we construct a continuous modelling of a Bayes update akin to a transition between thermodynamic ensembles. This leads to an information-theoretic interpretation of Jarzynski's equality, relating the expenditure of work to the influence of data via the likelihood. We propose one way to transfer the vocabulary and the formalism of thermodynamics (energy, work, heat) and statistical mechanics (partition functions) to statistical inference, starting from Bayes' law. Different kinds of inference processes are discussed and relative entropies are shown to follow from suitably constructed partitions as an analytical formulation of sampling processes. Lastly, we propose an effective dimension as a measure of system complexity. A numerical example from cosmology is put forward to illustrate these results.
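The link between work and the likelihood can be made concrete in the simplest limiting case. The sketch below assumes an instantaneous switch from prior to posterior and identifies the work with $W = -\ln \mathcal{L}(y|\theta)$ for samples $\theta$ drawn from the prior; both the identification and the Gaussian toy model are illustrative assumptions, not the paper's construction. Under these assumptions a Jarzynski-type equality reads $\langle e^{-W} \rangle_\pi = p(y)$, and Jensen's inequality gives $\langle W \rangle_\pi \geq -\ln p(y)$.

```python
import numpy as np

# Hypothetical toy setup: one datum, Gaussian noise width, Gaussian prior width.
Y_OBS, SIGMA, TAU = 1.0, 0.5, 2.0


def log_likelihood(theta):
    """Gaussian log-likelihood of the single datum Y_OBS with noise width SIGMA."""
    return -0.5 * ((Y_OBS - theta) / SIGMA) ** 2 - 0.5 * np.log(2 * np.pi * SIGMA**2)


def work_statistics(n_samples=200_000, seed=0):
    """Return (ln <exp(-W)>, <W>) with W = -ln L evaluated on prior samples."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, TAU, size=n_samples)  # draws from the Gaussian prior
    work = -log_likelihood(theta)                 # assumed identification of work
    neg_work = -work
    shift = neg_work.max()                        # log-mean-exp for stability
    log_mean_exp = shift + np.log(np.mean(np.exp(neg_work - shift)))
    return log_mean_exp, work.mean()


def log_evidence_exact():
    """Closed-form ln p(y) for this toy model, where y ~ N(0, SIGMA^2 + TAU^2)."""
    var = SIGMA**2 + TAU**2
    return -0.5 * Y_OBS**2 / var - 0.5 * np.log(2 * np.pi * var)


if __name__ == "__main__":
    log_mean_exp, mean_work = work_statistics()
    print(f"ln <exp(-W)>  = {log_mean_exp:+.4f}")
    print(f"ln evidence   = {log_evidence_exact():+.4f}")
    print(f"<W>           = {mean_work:+.4f}  (should exceed -ln evidence)")
```

The Monte Carlo estimate agrees with the closed-form evidence only up to sampling noise, so any comparison should allow a tolerance that shrinks with `n_samples`.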

Paper Structure

This paper contains 13 sections, 80 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: The Guggenheim scheme of standard thermodynamics (at fixed $N$), on the left, and one possibility to construct the analogous Guggenheim scheme of information theory for a Gaussian likelihood (at fixed $n$ and $N$), on the right. Intensive variables in blue, extensive variables in red.
  • Figure 2: Schematic changes of a univariate Gaussian (shaded region) and its corresponding potential (solid line), as given in equation \ref{eq: Gaussian partition}, with different values of $T$ and $J$. For simplicity, $F = 1$ is fixed.
  • Figure 3: Decrease of $|n-n_\mathrm{eff}|$ with temperature $T$ and even order of non-Gaussianity $k$. Here, the scaling is $T^{-k/2-1} \frac{(k+2)}{(k-1)!}$.
  • Figure 4: The Supernova posterior distribution of the matter density $\Omega_m$ and dark energy equation of state parameter $w_0$. The $2\sigma$-contour of the Gaussian approximation deviates from the true sampled posterior (blue).
  • Figure 5: The generating function (potential) $\ln Z[T]$ of the Supernova posterior as a function of temperature. The vertical line marks $T = 1$, where $Z[T]$ equals the Bayesian evidence $Z[T=1]$.
  • ...and 1 more figure