The Work Capacity of Channels with Memory: Maximum Extractable Work in Percept-Action Loops

Lukas J. Fiderer, Paul C. Barth, Isaac D. Smith, Hans J. Briegel

TL;DR

This work develops a thermodynamic-information framework for percept-action loops, modeling both agent and environment as finite-memory hidden Markov channels and introducing work capacity $C^{\mathrm{work}}$ as the maximal average extractable work rate. The authors derive a general expression for extractable work $W$ and assess the work capacity across channel classes, showing that in loops with feedback the conventional design principles—maximizing predictive power and forgetting past actions—can fail to be optimal. A key result is that, for unifilar product environments, efficient work extraction requires a balance between action entropy and percept predictability, while in more general settings maximal predictability may come at a thermodynamic cost. These findings reveal a fundamental tension between prediction and energy efficiency in active learning and adaptive systems, suggesting new energetic design principles beyond passive observation and opening avenues for future quantum extensions and goal-directed agents.

Abstract

Predicting future observations plays a central role in machine learning, biology, economics, and many other fields. It lies at the heart of organizational principles such as the variational free energy principle and has even been shown -- based on the second law of thermodynamics -- to be necessary for reaching the fundamental energetic limits of sequential information processing. While the usefulness of the predictive paradigm is undisputed, complex adaptive systems that interact with their environment are more than just predictive machines: they have the power to act upon their environment and cause change. In this work, we develop a framework to analyze the thermodynamics of information processing in percept-action loops -- a model of agent-environment interaction -- allowing us to investigate the thermodynamic implications of actions and percepts on equal footing. To this end, we introduce the concept of work capacity -- the maximum rate at which an agent can expect to extract work from its environment. Our results reveal that neither of two previously established design principles for work-efficient agents -- maximizing predictive power and forgetting past actions -- remains optimal in environments where actions have observable consequences. Instead, a trade-off emerges: work-efficient agents must balance prediction and forgetting, as remembering past actions can reduce the available free energy. This highlights a fundamental departure from the thermodynamics of passive observation, suggesting that prediction and energy efficiency may be at odds in active learning systems.

Paper Structure

This paper contains 26 sections, 23 theorems, 165 equations, 24 figures, 1 table.

Key Result

Theorem 1

Let $\texttt{agt} \rightleftarrows \texttt{env}$ be any percept-action loop. If the environment channel is unifilar, then there exists an a.m. predictive agent model $\texttt{agtM}$ for ...

Figures (24)

  • Figure 1: Tape setting (a) and percept-action loop setting (b). In the tape setting, (a), an agent processes symbols $S_t$ from a pre-existing tape. Outgoing symbols $A_t$ do not influence future inputs. In the percept-action loop setting, (b), the agent interacts with an environment (Env.) in rounds. In round $t$, the agent provides an action symbol $A_t$ and receives a percept symbol $S_t$ from the environment. Both the agent and environment can have memory, allowing future percepts to depend on past actions.
  • Figure 2: Circuit representation of percept-action loops, with time flowing from left to right. (a) The agent and environment are modeled as channels with memory. (b) The agent and environment are represented by their hidden Markov models, characterized by finite adaptive memories $M_t$ and $Z_t$. The transition matrices $\Theta$ and $\Phi$ remain fixed over time.
  • Figure 3: Bayesian network for a percept-action loop. Shown is a fragment for rounds $t-1$, $t$, and the beginning of round $t+1$. This type of Bayesian network plays an important role in the information-theoretic framework underlying our results (see the supplementary material for details). Note that to faithfully represent the dynamics of the agent and environment, auxiliary nodes (gray and reduced in size) are included. The colored nodes illustrate the condition for an agent to be maximally predictive in round $t$: the agent's memory (blue) must store all information from past actions and percepts $S_{0:t}A_{0:t+1}$ (red) that is relevant for predicting the current percept $S_t$ (green).
  • Figure 4: An agent agt interacting with the cascade of two environment channels $\texttt{env}_1$ and $\texttt{env}_2$.
  • Figure 5: A memoryless invariant environment with binary percept and action alphabets, $\mathcal{A}=\mathcal{S}=\{0,1\}$. The transition labels follow the scheme percept$\,|\,$action$~:~$transition probability. The transition on the left (right) corresponds to action "0" (respectively, "1").
  • ...and 19 more figures
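
The round structure described in the captions of Figures 1 and 2 can be illustrated with a small simulation. The following is a minimal sketch, not the paper's formalism: the function names (`run_loop`, `alternating_agent`, `echo_env`), the kernel signatures, and the ordering "agent acts on its memory and the previous percept, then the environment responds" are all assumptions made for illustration. Each kernel is fixed over time and returns a distribution over (output symbol, next memory), loosely mirroring the fixed transition matrices $\Theta$ and $\Phi$ acting on the memories $M_t$ and $Z_t$.

```python
import random

def sample(dist, rng):
    """Draw one outcome from a dict {outcome: probability}."""
    outcomes = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]

def run_loop(agent_kernel, env_kernel, m0, z0, rounds, seed=0):
    """Simulate a percept-action loop for a fixed number of rounds.

    agent_kernel(m, s_prev) -> {(action, next_agent_memory): prob}
    env_kernel(z, a)        -> {(percept, next_env_memory): prob}
    Both signatures are illustrative assumptions, not the paper's definitions.
    """
    rng = random.Random(seed)
    m, z, s_prev = m0, z0, None  # no percept exists before round 0
    history = []
    for _ in range(rounds):
        a, m = sample(agent_kernel(m, s_prev), rng)  # agent emits action A_t
        s, z = sample(env_kernel(z, a), rng)         # environment emits percept S_t
        history.append((a, s))
        s_prev = s
    return history

def alternating_agent(m, s_prev):
    """Hypothetical agent: outputs its memory bit, then flips it."""
    return {(m, 1 - m): 1.0}

def echo_env(z, a):
    """Hypothetical environment: the percept is the previous action,
    so actions have observable consequences one round later."""
    return {(z, a): 1.0}

history = run_loop(alternating_agent, echo_env, m0=0, z0=0, rounds=4)
print(history)
```

Because both example kernels are deterministic, the trajectory does not depend on the seed. Replacing `echo_env` with a kernel that ignores `a` would recover something closer to the tape setting of Figure 1(a), where outgoing symbols do not influence future inputs.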

Theorems & Definitions (41)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1
  • Theorem 2
  • Definition 5
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • ...and 31 more