Active Inference is a Subtype of Variational Inference

Wouter W. L. Nuijten, Mykola Lukashchuk

TL;DR

The paper tackles planning under uncertainty by addressing the computational bottleneck of Expected Free Energy (EFE) minimization in Active Inference. It formally recasts EFE-based planning as entropy-corrected variational inference, presenting an adjusted objective $F_{\tilde p}[q]$ that combines the standard VI objective $F_p[q]$ with entropic corrections, thereby linking Active Inference to Planning-as-Inference. A novel message-passing scheme on a region-extended Bethe factor graph is derived, introducing a channel $r_{y|x\theta,t}$ to localize entropy terms and enabling scalable, locally solvable updates for factored-state MDPs. The work discusses theoretical properties and practical limitations (degeneracy, quadratic scaling) and advocates hierarchical state-space partitioning to achieve scalability, bridging Active Inference with scalable variational inference for uncertainty-rich decision-making.

Abstract

Automated decision-making under uncertainty requires balancing exploitation and exploration. Classical methods treat these separately using heuristics, while Active Inference unifies them through Expected Free Energy (EFE) minimization. However, EFE minimization is computationally expensive, limiting scalability. We build on recent theory recasting EFE minimization as variational inference, formally unifying it with Planning-as-Inference and showing the epistemic drive as a unique entropic contribution. Our main contribution is a novel message-passing scheme for this unified objective, enabling scalable Active Inference in factored-state MDPs and overcoming high-dimensional planning intractability.

Paper Structure

This paper contains 17 sections, 12 theorems, 77 equations, 2 figures, 1 table.

Key Result

Theorem 1

The variational objective of de_vries_expected_2025 (presented in subsec:manipulated-vfe) can be rearranged into the standard objective $F_p[q]$ plus entropic correction terms, where $F_p[q]$ is the Variational Free Energy associated with the generative model.
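
For orientation, the Variational Free Energy is usually written as below. This is the textbook definition, not a formula recovered from the paper; grouping all model variables into $s = (y, x, \theta, u)$ is an assumption borrowed from the Figure 1 caption, and the second equality holds only when $p$ is normalized.

$$
F_p[q] \;=\; \mathbb{E}_{q(s)}\big[\log q(s) - \log p(s)\big] \;=\; \mathrm{KL}\big[\,q(s)\,\|\,p(s)\,\big], \qquad s = (y, x, \theta, u)
$$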

Figures (2)

  • Figure 1: Factor graph representation of the generative model \ref{eq:generative_model}. Nodes (boxes) represent factors of the generative model: $f_\theta$ is the prior on parameters, $f_{x_0}$ is the initial state prior, $f_{\mathrm{dyn},t}$ represents the dynamics $p(x_t|x_{t-1},\theta,u_t)$, $f_{y,t}$ represents the observation model $p(y_t|x_t,\theta)$, $f_{u,t}$ represents the action prior, and $f_{x,\hat{p},t}$ and $f_{y,\hat{p},t}$ represent the goal priors. Edges (lines) represent random variables: $\theta$ (parameters), $x_t$ (states), $y_t$ (observations), and $u_t$ (actions). In the Bethe approximation, each node $a$ maintains a local belief $q_a(s_a)$ over its scope (the variables connected to it), while each edge $i$ maintains a singleton belief $q_i(s_i)$. These local beliefs must satisfy the consistency constraints \ref{eq:local-consistency-constraint}. This factorization enables a local optimization scheme (message passing): rather than optimizing a single global distribution $q(y,x,\theta,u)$, we optimize a collection of local beliefs that communicate through messages.
  • Figure 2: Time-slice factor graph corresponding to the scheme introduced in \ref{sec:message_passing}. To obtain a full generative model for inference, we chain $T$ of these slices into a terminated factor graph.
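
The local beliefs and consistency constraints described in the Figure 1 caption can be illustrated on a toy model. The sketch below is not the paper's scheme: the two-variable graph, the factor values, and the helper names are all hypothetical, and it uses the standard Bethe free energy with variable degrees $d_i$. It checks that the node beliefs marginalize to the edge beliefs, and that on a tree with exact marginals the Bethe free energy equals $-\log Z$.

```python
import numpy as np

# Hypothetical toy model (not from the paper): two binary variables x1, x2
# with factors f1(x1) and f12(x1, x2). The graph is a tree.
f1 = np.array([0.6, 0.4])
f12 = np.array([[2.0, 1.0],
                [0.5, 3.0]])

# Exact joint, used here only to obtain exact local beliefs.
joint = f1[:, None] * f12          # unnormalized joint over (x1, x2)
Z = joint.sum()                    # partition function
p = joint / Z

# Node beliefs q_a(s_a) over each factor's scope, and singleton edge beliefs q_i(s_i).
q_f1 = p.sum(axis=1)               # belief of factor f1 over x1
q_f12 = p                          # belief of factor f12 over (x1, x2)
q_x1 = p.sum(axis=1)               # singleton belief over x1
q_x2 = p.sum(axis=0)               # singleton belief over x2


def local_free_energy(q, f):
    """Return E_q[log q - log f] for one factor belief."""
    return float(np.sum(q * (np.log(q) - np.log(f))))


def entropy(q):
    """Return the Shannon entropy of a discrete belief."""
    return float(-np.sum(q * np.log(q)))


# Number of factors attached to each variable.
degree = {"x1": 2, "x2": 1}

# Bethe free energy: factor terms plus (d_i - 1) * H[q_i] per variable.
bethe = (local_free_energy(q_f1, f1)
         + local_free_energy(q_f12, f12)
         + (degree["x1"] - 1) * entropy(q_x1)
         + (degree["x2"] - 1) * entropy(q_x2))

print(bethe, -np.log(Z))
```

In a message-passing scheme the beliefs would be produced by local updates rather than by marginalizing the exact joint; the exact joint is used here only so that the identity with $-\log Z$ can be checked directly.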

Theorems & Definitions (23)

  • Theorem 1
  • proof
  • Theorem 2: The stationary scheme for Active Inference
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Lemma 5
  • proof
  • proof
  • ...and 13 more