Derivation of the Variational Bayes Equations
Alianna J. Maren
TL;DR
This work analyzes how variational Bayes equations can be derived and translated between Beal's canonical formulation and Friston's active inference framework, with explicit attention to a Markov blanket separation between external and representational systems. It presents two equivalent expressions for the variational free energy $F$, namely $F = E_q[L(x)] - H[q(\psi|r)]$ and $F = L(s,a,r) + D_{KL}[q(\psi|r)||p(\psi|s,a,r)]$, and clarifies the roles of the log-likelihood $L$, entropy $H$, and the reverse KL divergence. The paper provides a Rosetta-stone mapping across Beal, Friston, and Blei nomenclatures and explains how integrating over the model space can be framed consistently within active inference. It then extends the framework to a computational engine using the 2-D Cluster Variation Method (CVM) to compute free-energy minima for both external and representational systems, enabling parameter learning via CVM enthalpy and entropy terms. Overall, the work lays out a scalable path for applying variational Bayes and active inference to multi-scale systems via a CVM-based computational engine, with implications for future neural and machine learning architectures.
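As a quick numerical illustration of why the two expressions for $F$ coincide, the following minimal sketch evaluates both on a toy discrete model. It assumes that $L(x)$ denotes the negative log of the joint probability over $(s, a, r, \psi)$ and that $L(s,a,r)$ denotes the negative log of the marginal probability of $(s, a, r)$; this sign convention is an assumption made here so that the two forms agree, and it should be checked against the definitions in the body of the report. The function name `free_energy_forms` and the toy probabilities are hypothetical and chosen only for illustration.

```python
import math

def free_energy_forms(joint, q):
    """Evaluate both forms of the variational free energy on a discrete toy model.

    joint[i] : p(psi_i, s, a, r) for one fixed, observed (s, a, r)  (toy values)
    q[i]     : q(psi_i | r), strictly positive and summing to 1     (toy values)
    Assumption: L is a negative log probability (an "energy" or surprise).
    """
    evidence = sum(joint)                       # p(s, a, r), marginal over psi
    posterior = [p / evidence for p in joint]   # p(psi | s, a, r)

    # Form 1: F = E_q[L(x)] - H[q(psi|r)]
    expected_energy = sum(qi * -math.log(pi) for qi, pi in zip(q, joint))
    entropy = -sum(qi * math.log(qi) for qi in q)
    f_energy_entropy = expected_energy - entropy

    # Form 2: F = L(s,a,r) + D_KL[q(psi|r) || p(psi|s,a,r)]
    surprise = -math.log(evidence)
    kl = sum(qi * math.log(qi / pi) for qi, pi in zip(q, posterior))
    f_surprise_kl = surprise + kl

    return f_energy_entropy, f_surprise_kl, kl

# Toy joint probabilities over three hidden states psi, for one fixed (s, a, r).
joint = [0.10, 0.25, 0.05]
q = [0.2, 0.5, 0.3]

f1, f2, kl = free_energy_forms(joint, q)
print(f1, f2, kl)
```

Under these assumptions the two returned values of $F$ match to floating-point precision, and the KL term vanishes when `q` equals the exact posterior, so $F$ then reduces to the negative log evidence.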
Abstract
The derivation of the key equations for the variational Bayes approach is well known in certain circles. However, translating the fundamental derivations (e.g., as found in Beal's work) into Friston's notation is somewhat delicate. Further, using variational Bayes in the context of a system with a Markov blanket requires special attention. This Technical Report presents the derivation in detail. It further illustrates how the variational Bayes method provides a framework for a new computational engine incorporating the 2-D cluster variation method (CVM), which supplies the necessary free energy equation; this equation can be minimized over the states of the external system and over the states of the representational system, respectively.
