Thermodynamic Response Functions in Singular Bayesian Models

Sean Plummer

Abstract

Singular statistical models, including mixtures, matrix factorization, and neural networks, violate regular asymptotics due to parameter non-identifiability and degenerate Fisher geometry. Although singular learning theory characterizes marginal likelihood behavior through invariants such as the real log canonical threshold and singular fluctuation, these quantities remain difficult to interpret operationally. At the same time, widely used criteria such as WAIC and WBIC appear disconnected from underlying singular geometry. We show that posterior tempering induces a one-parameter deformation of the posterior distribution whose associated observables generate a hierarchy of thermodynamic response functions. A universal covariance identity links derivatives of tempered expectations to posterior fluctuations, placing WAIC, WBIC, and singular fluctuation within a unified response framework. Within this framework, classical quantities from singular learning theory acquire natural thermodynamic interpretations: RLCT governs the leading free-energy slope, singular fluctuation corresponds to curvature of the tempered free energy, and WAIC measures predictive fluctuation. We formalize an observable algebra that quotients out non-identifiable directions, allowing structurally meaningful order parameters to be constructed in singular models. Across canonical singular examples, including symmetric Gaussian mixtures, reduced-rank regression, and overparameterized neural networks, we empirically demonstrate phase-transition-like behavior under tempering. Order parameters collapse, susceptibilities peak, and complexity measures align with structural reorganization in posterior geometry. Our results suggest that thermodynamic response theory provides a natural organizing framework for interpreting complexity, predictive variability, and structural reorganization in singular Bayesian learning.

Paper Structure (51 sections, 5 theorems, 48 equations, 4 figures)

Introduction
Motivation
Core idea: tempering as a deformation
Contributions and paper organization
Observable algebra.
Universal response identities.
Empirical confirmation in canonical singular models.
Toward a thermodynamic framework for singular learning.
Organization.
Background
Singular models and posterior tempering
Minimal results from singular learning theory
Observable Algebra
Observables and posterior expectations
Equivalence relation and induced distribution space
...and 36 more sections

Key Result

Proposition 1

Let $f:\Theta\to\mathbb{R}$ be a measurable function.

Figures (4)

Figure 1: Order parameters and susceptibilities arise from the covariance identity. The response-speed bound links rates of change to fluctuation magnitudes and connects naturally to heat capacity.
Figure 2: Response hierarchy for the mixture symmetry-breaking experiment. The top panel shows the order parameter $m(\beta) = \mathbb{E}_{\beta}[|\mu|]$, which measures the posterior preference for one component mean over the symmetric configuration. At low inverse temperature the posterior explores both symmetric modes. As $\beta$ increases the posterior concentrates on one mode, producing spontaneous symmetry breaking. The middle panel shows the susceptibility $\chi(\beta) = \beta \mathrm{Var}(|\mu|)$, which peaks near the transition where the posterior fluctuates between symmetric configurations. The bottom panel shows the WAIC complexity $\log(1+p_{\mathrm{WAIC}}(\beta)/n)$. Predictive variance decreases as the posterior concentrates, indicating reduced predictive uncertainty once symmetry is broken. The vertical dashed line marks the inverse temperature $\beta$ at which the susceptibility is maximal.
Figure 3: Response hierarchy for reduced-rank regression. The order parameter $m(\beta)=\mathbb{E}_\beta[s_2(B)]$ tracks the second singular value of the regression matrix. As $\beta$ increases, posterior concentration drives the second singular value toward zero, indicating collapse to a lower-rank model. The susceptibility $\chi(\beta)=\beta\mathrm{Var}(s_2)$ measures fluctuations in the effective rank and peaks near the inverse temperature at which rank collapse occurs. The WAIC complexity decreases as the posterior eliminates redundant directions in parameter space. The alignment between susceptibility and predictive complexity illustrates how singular structure controls predictive variability.
Figure 4: Response hierarchy for the neural network hidden-unit collapse experiment. The order parameter $m(\beta)=\mathbb{E}_\beta[N_{\mathrm{eff}}]$ measures the effective number of active hidden units. Although the network contains $H=10$ units, the posterior favors a smaller effective number as $\beta$ increases. Redundant hidden units become inactive due to symmetry and scaling degeneracies. The susceptibility $\chi(\beta)=\beta\mathrm{Var}(N_{\mathrm{eff}})$ peaks when multiple configurations with different numbers of active units coexist. This region corresponds to maximal posterior uncertainty over network representations. The WAIC complexity decreases as redundant units collapse, indicating that predictive uncertainty is highest when the network's internal representation is unstable.
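
The order parameter and susceptibility curves described in the captions above can be illustrated with a minimal sketch. It is not the paper's experimental setup. It assumes a hypothetical one-parameter symmetric mixture $0.5\,N(\mu,1)+0.5\,N(-\mu,1)$, a Gaussian prior on $\mu$ with an arbitrarily chosen scale `prior_sd`, synthetic data, and a tempered posterior proportional to prior times likelihood raised to the power $\beta$, evaluated by quadrature on a uniform grid. The function name `tempered_response` and all numerical settings are illustrative.

```python
import numpy as np

def tempered_response(x, betas, grid=None, prior_sd=3.0):
    """Return m(beta) = E_beta[|mu|] and chi(beta) = beta * Var_beta(|mu|).

    Toy model (an assumption of this sketch): x_i ~ 0.5 N(mu, 1) + 0.5 N(-mu, 1),
    prior mu ~ N(0, prior_sd^2), tempered posterior ∝ prior * likelihood**beta.
    Expectations are computed by normalised sums over a uniform grid in mu.
    """
    if grid is None:
        grid = np.linspace(-6.0, 6.0, 2401)
    x = np.asarray(x, dtype=float)

    # Log-likelihood of the full data set at every grid value of mu.
    a = -0.5 * (x[None, :] - grid[:, None]) ** 2
    b = -0.5 * (x[None, :] + grid[:, None]) ** 2
    per_point = np.logaddexp(a, b) + np.log(0.5) - 0.5 * np.log(2.0 * np.pi)
    loglik = per_point.sum(axis=1)

    logprior = -0.5 * (grid / prior_sd) ** 2
    f = np.abs(grid)  # observable |mu|, invariant under the sign symmetry

    m, chi = [], []
    for beta in betas:
        logw = logprior + beta * loglik
        w = np.exp(logw - logw.max())   # subtract the max for numerical stability
        w /= w.sum()
        mean = np.sum(w * f)
        var = np.sum(w * (f - mean) ** 2)
        m.append(mean)
        chi.append(beta * var)
    return np.array(m), np.array(chi)

# Synthetic data with true |mu| = 1 (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
x = rng.choice([-1.0, 1.0], size=n) + rng.normal(size=n)

betas = np.linspace(0.0, 1.0, 21)
m, chi = tempered_response(x, betas)
```

Plotting `m` and `chi` against `betas` gives curves of the same type as the top and middle panels of the figures. Whether a susceptibility peak appears, and where, depends on the data, the prior scale, and the range of $\beta$ in this toy setting.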

Theorems & Definitions (9)

Proposition 1: Observable representation on $\mathcal{M}$
proof : Proof sketch
Proposition 2: Covariance identity
proof
Proposition 3: Response identities descend to the model image
proof : Proof sketch
Theorem 1: Thermodynamic response hierarchy
Proposition 4: Response-speed bound
proof
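
The statement of the covariance identity in Proposition 2 is not reproduced on this page. A standard form of such an identity can be sketched as follows, under assumed notation that may differ from the paper's: a prior $\varphi(\theta)$, an empirical negative log-likelihood $L_n(\theta)$, and a tempered posterior $p_\beta(\theta) \propto \varphi(\theta)\, e^{-\beta n L_n(\theta)}$ with normalizer $Z(\beta)$. Assuming differentiation under the integral sign is permitted, for an integrable observable $f$:

$$
\begin{aligned}
\mathbb{E}_\beta[f] &= \frac{1}{Z(\beta)} \int f(\theta)\, \varphi(\theta)\, e^{-\beta n L_n(\theta)}\, d\theta,
\qquad Z(\beta) = \int \varphi(\theta)\, e^{-\beta n L_n(\theta)}\, d\theta, \\
\frac{d}{d\beta}\, \mathbb{E}_\beta[f]
&= -n\, \mathbb{E}_\beta[f L_n] + n\, \mathbb{E}_\beta[f]\, \mathbb{E}_\beta[L_n]
= -n\, \mathrm{Cov}_\beta(f, L_n).
\end{aligned}
$$

Taking $f = n L_n$ and writing $F(\beta) = -\log Z(\beta)$ gives $F'(\beta) = \mathbb{E}_\beta[n L_n]$ and $F''(\beta) = -\mathrm{Var}_\beta(n L_n)$, which is the sense in which the curvature of a tempered free energy is a fluctuation quantity.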
