Dynamic planning in hierarchical active inference

Matteo Priorelli, Ivilin Peev Stoianov

TL;DR

This study distances itself from traditional views centered on neural networks and reinforcement learning, and points toward a yet unexplored direction in active inference: hybrid representations in hierarchical models.

Abstract

By dynamic planning, we refer to the ability of the human brain to infer and impose motor trajectories related to cognitive decisions. A recent paradigm, active inference, brings fundamental insights into the adaptation of biological organisms, constantly striving to minimize prediction errors to restrict themselves to life-compatible states. Over the past years, many studies have shown how human and animal behaviors could be explained in terms of active inference - either as discrete decision-making or continuous motor control - inspiring innovative solutions in robotics and artificial intelligence. Still, the literature lacks a comprehensive outlook on effectively planning realistic actions in changing environments. Setting ourselves the goal of modeling complex tasks such as tool use, we delve into the topic of dynamic planning in active inference, keeping in mind two crucial aspects of biological behavior: the capacity to understand and exploit affordances for object manipulation, and to learn the hierarchical interactions between the self and the environment, including other agents. We start from a simple unit and gradually describe more advanced structures, comparing recently proposed design choices and providing basic examples. This study distances itself from traditional views centered on neural networks and reinforcement learning, and points toward a yet unexplored direction in active inference: hybrid representations in hierarchical models.

Paper Structure

This paper contains 14 sections, 48 equations, 18 figures.

Figures (18)

  • Figure 1: (a) Factor graph of a basic unit for static reaching. Variables and factors are indicated by circles and squares, respectively. Hidden states $\bm{x}$ (e.g., the arm angle) generate observations $\bm{o}_p$ (e.g., the arm proprioception) through the likelihood function $\bm{g}_p$, and their 1st derivatives $\bm{x}^\prime$ (e.g., the arm velocity) through a dynamics function $\bm{f}$. In contrast to optimal control, here action follows observation prediction errors arising from a simple attractor $\bm{\rho}$ embedded in the model dynamics, or from a prior belief $\bm{\eta}_x$ over the arm angle. (b) Agent's generative model.
  • Figure 2: (a) In this task, the agent (a single DoF) has to reach a target angle represented by the red circle. Estimated and real arms are displayed in cyan and blue, respectively. Here, $\bm{\Pi}_{\eta,x} = 0$, $\bm{\rho} = 120$°, and $\bm{\mu}_x$ was initialized to $-40$°. The time step is indicated in the bottom left corner of each frame. Since the belief was initialized at a negative value, the likelihood initially pulls the arm in the wrong direction before adapting to the dynamics attractor. (b) The top graph shows the evolution of the real angle $\bm{x}$, its belief $\bm{\mu}_x$, and the target angle $\bm{\rho}$. The middle graph shows the evolution of the belief of the velocity $\bm{\mu}_x^\prime$ and the belief derivative $\dot{\bm{\mu}}_x$. The bottom graph shows the evolution of all the components that comprise the belief update: the belief of the velocity $\bm{\mu}_x^\prime$, the likelihood gradient $\partial_{x} \bm{g}_p^T \bm{\Pi}_{o,p} \bm{\varepsilon}_{o,p}$, the dynamics gradient $\partial_{x} \bm{f}^T \bm{\Pi}_x \bm{\varepsilon}_x$, and the weighted dynamics prediction error $-\bm{\Pi}_x \bm{\varepsilon}_x$. The latter is plotted to compare its magnitude with the other components, although it affects the 1st temporal order rather than the belief of the angle.
  • Figure 3: (a) The target is now encoded in the hidden causes $\bm{v}$, generating a dynamic attractor for object tracking. In fact, both hidden states and hidden causes generate predictions through proprioceptive and visual likelihood functions $\bm{g}_p$ and $\bm{g}_v$, and both concur in estimating the 1st-order hidden states $\bm{x}^\prime$. (b) Agent's generative model.
  • Figure 4: (a) In this task, the agent has to track a target angle rotating at a constant velocity. Estimated and real targets are displayed in purple and red, respectively. Here, $\bm{\Pi}_{\eta,x} = 0$, $\bm{\Pi}_{\eta,v} = 0$, $\bm{v}$ was initialized to $60$°, and both $\bm{\mu}_x$ and $\bm{\mu}_v$ were initialized to $0$°. As the belief of the hidden causes approaches the real target angle, it pulls the belief of the hidden states with it. (b) The top graph shows the evolution of the real angle $\bm{x}$, its belief $\bm{\mu}_x$, the target angle $\bm{v}$, and its belief $\bm{\mu}_v$. The middle graph shows the evolution of the belief of the velocity $\bm{\mu}_x^\prime$ and the belief derivative $\dot{\bm{\mu}}_x$, as before. The bottom graph shows the evolution of the components that comprise the hidden causes update: the likelihood gradient $\partial_{v} \bm{g}_v^T \bm{\Pi}_{o,v} \bm{\varepsilon}_{o,v}$ and the dynamics gradient $\partial_{v} \bm{f}^T \bm{\Pi}_x \bm{\varepsilon}_x$. Note how, in the middle plot, the estimated 1st temporal order stabilizes to a non-zero value as the agent rotates with a constant angular velocity.
  • Figure 5: (a) Factor graph of the unit with object affordances. Hidden states are factorized into independent components that encode the actual bodily states and potential configurations related to the objects. The first component $\bm{x}_0$ generates proprioceptive predictions, while the successive components generate visual predictions of the objects. Every hidden cause $v_m$ now defines an attractor gain expressing the strength of an agent's intention (encoded as a distinct evolution of the world). These are combined to produce a trajectory $\bm{\eta}_{x}^\prime$, comprising all the body configurations $\bm{x}_n$. The transition between intentions can be achieved by a higher-level prior, e.g., a belief of tactile sensations. The weights $\bm{W}_m$ of the intention can be used, e.g., to track moving objects, while the bias $\bm{b}_m$ realizes a static configuration. See [Priorelli2023, Priorelli2023d] for more details. (b) Agent's generative model.
  • ...and 13 more figures
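
The belief update listed in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation, and several choices here are assumptions made only for illustration: an identity likelihood $g_p(x) = x$, a linear attractor $f(x) = k(\rho - x)$ with a hypothetical gain `k`, unit precisions, a real arm starting at 0°, and an action that directly sets the arm velocity in proportion to the proprioceptive prediction error. Only $\rho = 120$°, the initial belief of $-40$°, and the zero prior precision are taken from the caption.

```python
import numpy as np


def simulate_static_reaching(rho=120.0, mu0=-40.0, x0=0.0, k=1.0,
                             pi_o=1.0, pi_x=1.0, pi_eta=0.0, eta=0.0,
                             dt=0.01, steps=4000):
    """Euler simulation of a single-DoF reaching unit (angles in degrees).

    Assumed forms (not from the source): g_p(x) = x, f(x) = k * (rho - x),
    and arm velocity = -pi_o * eps_o.
    """
    x, mu, mu_p = x0, mu0, 0.0          # real angle, belief, belief of velocity
    xs = np.empty(steps + 1)
    mus = np.empty(steps + 1)
    xs[0], mus[0] = x, mu

    for t in range(steps):
        eps_o = x - mu                  # proprioceptive prediction error
        eps_x = mu_p - k * (rho - mu)   # dynamics prediction error
        eps_eta = mu - eta              # prior prediction error (off when pi_eta = 0)

        # Belief update: velocity belief + likelihood gradient + dynamics gradient.
        # With f(x) = k * (rho - x), the Jacobian of f is -k.
        d_mu = mu_p + pi_o * eps_o + (-k) * pi_x * eps_x - pi_eta * eps_eta
        # 1st temporal order: weighted dynamics prediction error.
        d_mu_p = -pi_x * eps_x
        # Action suppresses the proprioceptive prediction error (assumed velocity control).
        d_x = -pi_o * eps_o

        mu += dt * d_mu
        mu_p += dt * d_mu_p
        x += dt * d_x
        xs[t + 1], mus[t + 1] = x, mu

    return xs, mus


if __name__ == "__main__":
    xs, mus = simulate_static_reaching()
    print(f"final arm angle: {xs[-1]:.2f}, final belief: {mus[-1]:.2f}")
```

Under these assumptions the arm first moves toward negative angles, since the belief starts below the real angle, and then both the arm and the belief settle on the attractor, which matches the qualitative behavior described in the caption.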