Reframing the Expected Free Energy: Four Formulations and a Unification

Théophile Champion, Howard Bowman, Dimitrije Marković, Marek Grześ

TL;DR

The paper formalizes the unification problem for the four expected free energy (EFE) formulations used in active inference by introducing a forecast distribution $F$, a target distribution $T$, and a root definition $\mathcal{G}_{rt}$. If the root is taken to be the risk over observations plus ambiguity ($C_{ROA}$), all four formulations, $C_{RSA}$, $C_{ROA}$, $C_{IGPV}$, and $C_{3E}$, can be recovered, but this constrains the prior preferences over observations, which may be incompatible with the likelihood mapping, i.e., $T(o|s)$ and $A$. Alternatively, taking $C_{RSA}$ as the root, for which a justification is known, recovers only two formulations, with ROA and IGPV as lower-bound constructions. The work clarifies the theoretical foundations, demonstrates the compatibility tension between prior preferences and the likelihood, and outlines avenues for deriving a principled, computable EFE suitable for deep active inference. It highlights the need for alternative factorizations or principled priors to achieve full unification and practical applicability in decision-making under uncertainty.

Abstract

Active inference is a leading theory of perception, learning and decision making, which can be applied to neuroscience, robotics, psychology, and machine learning. Active inference is based on the expected free energy, which is mostly justified by the intuitive plausibility of its formulations, e.g., the risk plus ambiguity and information gain / pragmatic value formulations. This paper seeks to formalize the problem of deriving these formulations from a single root expected free energy definition, i.e., the unification problem. Then, we study two settings, each one having its own root expected free energy definition. In the first setting, no justification for the expected free energy has been proposed to date, but all the formulations can be recovered from it. However, in this setting, the agent cannot have arbitrary prior preferences over observations. Indeed, only a limited class of prior preferences over observations is compatible with the likelihood mapping of the generative model. In the second setting, a justification of the root expected free energy definition is known, but this setting only accounts for two formulations, i.e., the risk over states plus ambiguity and entropy plus expected energy formulations.

Paper Structure

This paper contains 18 sections, 1 theorem, 60 equations, 2 figures, 1 table.

Key Result

Theorem 4

Let $\bm{V}_1, \bm{V}_2, \bm{S} \subseteq \mathcal{V}$ be three sets. If $\bm{V}_1$ and $\bm{V}_2$ are d-separated by $\bm{S}$, then $\bm{V}_1$ and $\bm{V}_2$ are conditionally independent given $\bm{S}$, i.e., $\bm{V}_1 \perp \! \! \! \perp \bm{V}_2 \mid \bm{S}$.
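
As a minimal numerical illustration of this theorem (a sketch that is not taken from the paper), consider the chain $X \to Z \to Y$ with $\bm{V}_1 = \{X\}$, $\bm{V}_2 = \{Y\}$ and $\bm{S} = \{Z\}$. The only trail between $X$ and $Y$ passes through the non-collider $Z$, so conditioning on $Z$ blocks it. The variable names and all probability values below are hypothetical; the code only checks that $P(x, y \mid z) = P(x \mid z) P(y \mid z)$ holds for every assignment.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain X -> Z -> Y.
p_x = {0: 0.6, 1: 0.4}
p_z_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}    # p_z_given_x[x][z]
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.25, 1: 0.75}}  # p_y_given_z[z][y]

# Joint distribution implied by the factorization P(x) P(z|x) P(y|z).
joint = {
    (x, z, y): p_x[x] * p_z_given_x[x][z] * p_y_given_z[z][y]
    for x, z, y in product((0, 1), repeat=3)
}

def marginal(keep):
    """Sum the joint over every variable whose index is not in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

p_z = marginal([1])
p_xz = marginal([0, 1])
p_zy = marginal([1, 2])

def max_violation():
    """Largest gap between P(x, y | z) and P(x | z) P(y | z)."""
    worst = 0.0
    for (x, z, y), p_xzy in joint.items():
        lhs = p_xzy / p_z[(z,)]
        rhs = (p_xz[(x, z)] / p_z[(z,)]) * (p_zy[(z, y)] / p_z[(z,)])
        worst = max(worst, abs(lhs - rhs))
    return worst

print(max_violation())
```

The printed gap should be zero up to floating-point error. This checks a single graph and a single parameter setting, so it illustrates the statement without proving it.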

Figures (2)

  • Figure 1: This figure illustrates how matrices can be seen as linear transformations of the Euclidean space. The example taken is a matrix that rotates the Euclidean space by 45 degrees clockwise and scales each axis by a factor of 0.5.
  • Figure 2: This figure illustrates the linear transformation corresponding to the $\bm{A}$ matrix in Equation \ref{eq:a_matrix_appendix_b_1}. The gray grid represents a set of points in the Euclidean space on which the linear transformation corresponding to the $\bm{A}$ matrix will be applied. The red grid represents where the gray grid lands when applying this linear transformation. The long (blue) segment on the left-hand side of the figure corresponds to the 1-dimensional simplex in which the prior preferences over states $\bm{C}_s$ live. The short (blue) segment on the right-hand side of the figure corresponds to where the 1-dimensional simplex lands when applying the linear transformation, i.e., the class of valid prior preferences over observations $\bm{C}_o$. The green point (✓ on the right-hand side) corresponds to a point that lives on the projected simplex (i.e., the short blue segment). This point will be projected back to the original 1-dimensional simplex (i.e., the long blue segment) by $\bm{A}^{-1}$. The brown point (✗ on the right-hand side) corresponds to a point that lives outside the short blue segment. This point will be projected outside of the long blue segment by $\bm{A}^{-1}$.
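
The geometry described in the two captions can be sketched numerically. The block below is a rough illustration under stated assumptions, not the paper's code: the matrix `A` is a hypothetical 2×2 likelihood matrix whose columns sum to one (it is not the matrix from the paper's appendix), and the helper names `matvec`, `inverse_2x2`, `on_simplex` and `is_valid_preference` are invented for this example. A candidate $\bm{C}_o$ is treated as valid when $\bm{A}^{-1}\bm{C}_o$ lands back on the simplex of $\bm{C}_s$.

```python
import math

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def on_simplex(p, tol=1e-9):
    """True if p has non-negative entries that sum to one."""
    return all(x >= -tol for x in p) and abs(sum(p) - 1.0) <= tol

# Figure 1: rotate by 45 degrees clockwise and scale each axis by 0.5.
theta = -math.pi / 4  # a negative angle gives a clockwise rotation
R = [[0.5 * math.cos(theta), -0.5 * math.sin(theta)],
     [0.5 * math.sin(theta), 0.5 * math.cos(theta)]]
rotated = matvec(R, [1.0, 0.0])  # lands below the x-axis, at half the length

# Figure 2: hypothetical likelihood matrix, A[o][s] = P(o | s).
A = [[0.8, 0.3],
     [0.2, 0.7]]

def is_valid_preference(c_o, a=A):
    """Map C_o back through A^{-1} and test whether it lies on the simplex."""
    c_s = matvec(inverse_2x2(a), c_o)
    return on_simplex(c_s), c_s

ok_green, c_s_green = is_valid_preference([0.5, 0.5])  # inside the short segment
ok_brown, c_s_brown = is_valid_preference([0.9, 0.1])  # outside the short segment
print(ok_green, c_s_green)
print(ok_brown, c_s_brown)
```

With this particular `A`, the image of the simplex is the set of $\bm{C}_o$ whose first entry lies between 0.3 and 0.8, so `[0.5, 0.5]` maps back to a valid $\bm{C}_s$, while `[0.9, 0.1]` maps to a vector with a negative entry.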

Theorems & Definitions (4)

  • Definition 1: Collider
  • Definition 2: Blocked trail
  • Definition 3: D-separated
  • Theorem 4: d-separation and independence