Mean-Field Microcanonical Gradient Descent
Marcus Häggbom, Morten Karlsmark, Joakim Andén
TL;DR
The paper addresses entropy collapse in microcanonical gradient descent for high-dimensional energy-based sampling by proposing a mean-field variant (MF-MGDM) that updates a batch of samples via the batch mean energy. It provides a theoretical bound on the entropy loss that tightens as the batch size increases, and demonstrates empirically that MF-MGDM preserves entropy better than MGDM while maintaining competitive likelihood on synthetic AR$(p)$ and CIR processes and on real financial time series (e.g., S&P 500, yields). The approach combines micro- and macrocanonical concepts with efficient gradient-based sampling, yielding a better trade-off in KL divergence and more stable entropy dynamics. Limitations include the stationarity assumptions, the requirement that the energy function be differentiable, and the need for further exploration of the forward KL divergence and richer energy features.
Abstract
Microcanonical gradient descent is a sampling procedure for energy-based models allowing for efficient sampling of distributions in high dimension. It works by transporting samples from a high-entropy distribution, such as Gaussian white noise, to a low-energy region using gradient descent. We put this model in the framework of normalizing flows, showing how it can often overfit by losing an unnecessary amount of entropy in the descent. As a remedy, we propose a mean-field microcanonical gradient descent that samples several weakly coupled data points simultaneously, allowing for better control of the entropy loss while paying little in terms of likelihood fit. We study these models in the context of financial time series, illustrating the improvements on both synthetic and real data.
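The contrast between the two descents can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the energy is taken to be a squared mismatch between a feature vector and a target, the features are just the empirical mean and second moment of each sample, and the target values, step size, and function names (`features`, `pullback`, `descend`) are hypothetical. The per-sample descent penalizes each sample's own feature mismatch, while the mean-field version penalizes only the mismatch of the batch-averaged features.

```python
import numpy as np

def features(x):
    """Toy feature map (assumed): per-sample mean and second moment, shape (n, 2)."""
    return np.stack([x.mean(axis=1), (x ** 2).mean(axis=1)], axis=1)

def pullback(x, r):
    """Apply the transposed feature Jacobian to the residual r for each sample.

    r has shape (n, 2) for per-sample residuals, or (2,) for one shared residual.
    """
    d = x.shape[1]
    r = np.broadcast_to(r, (x.shape[0], 2))
    return (r[:, :1] + 2.0 * x * r[:, 1:]) / d

def descend(x, target, lr, steps, mean_field):
    """Gradient descent on 0.5 * ||features - target||^2.

    mean_field=False: each sample descends its own energy.
    mean_field=True: all samples descend the energy of the batch-mean features;
    the 1/n factor from the batch mean is absorbed into the step size lr.
    """
    x = x.copy()
    for _ in range(steps):
        r = features(x) - target      # per-sample residuals, shape (n, 2)
        if mean_field:
            r = r.mean(axis=0)        # one residual shared by the whole batch
        x = x - lr * pullback(x, r)
    return x

# Start from Gaussian white noise: n samples of dimension d.
rng = np.random.default_rng(0)
n, d = 256, 64
x0 = rng.standard_normal((n, d))
target = np.array([0.0, 0.25])        # hypothetical target mean and second moment

x_single = descend(x0, target, lr=0.1 * d, steps=500, mean_field=False)
x_mf = descend(x0, target, lr=0.1 * d, steps=500, mean_field=True)

# Mismatch of the batch-averaged features with the target.
res_single = np.linalg.norm(features(x_single).mean(axis=0) - target)
res_mf = np.linalg.norm(features(x_mf).mean(axis=0) - target)

# Spread of the second-moment feature across samples (a crude diversity proxy).
spread_single = features(x_single).std(axis=0)[1]
spread_mf = features(x_mf).std(axis=0)[1]

print("batch-mean residual:", res_single, res_mf)
print("feature spread across samples:", spread_single, spread_mf)
```

In this toy setting both descents match the target on average, but the per-sample descent forces every sample onto the same feature values, whereas the mean-field descent constrains only the batch average and leaves the individual samples spread out. The feature spread is only a rough stand-in for entropy, not the quantity analyzed in the paper.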
