Universal AI maximizes Variational Empowerment

Yusuke Hayashi, Koichi Takahashi

TL;DR

The paper tackles the challenge of achieving robust exploration in Bayes-optimal universal AI by unifying AIXI with variational empowerment and framing Self-AIXI as an intrinsically motivated learner. It shows that Self-AIXI's mixture-policy regularization acts as a variational empowerment bonus and proves that AIXI-like planning can be seen as minimizing the expected variational free energy $\mathcal{F}$, thereby integrating goal-directed behavior with curiosity under Active Inference. A key contribution is the decomposition of the empowerment objective into a negative KL-regularization term plus mutual information, demonstrating that empowerment arises naturally when an agent pursues uncertainty reduction and high optionality. The results explain why power-seeking tendencies can emerge from empowerment as an intrinsic drive, not just as an instrumental means to reward, and establish that, under suitable conditions, Self-AIXI asymptotically converges to the Bayes-optimal policy $\pi^{*}$, inheriting Legg-Hutter intelligence. While the work is theoretical and relies on idealized assumptions (e.g., unbounded computation, tractable empowerment approximations), it provides a principled framework for analyzing intrinsic motivations in universal agents and informs discussions on AI safety and controllability.

Abstract

This paper presents a theoretical framework unifying AIXI -- a model of universal AI -- with variational empowerment as an intrinsic drive for exploration. We build on the existing framework of Self-AIXI -- a universal learning agent that predicts its own actions -- by showing how one of its established terms can be interpreted as a variational empowerment objective. We further demonstrate that universal AI's planning process can be cast as minimizing expected variational free energy (the core principle of Active Inference), thereby revealing how universal AI agents inherently balance goal-directed behavior with uncertainty reduction (curiosity). Moreover, we argue that power-seeking tendencies of universal AI agents can be explained not only as an instrumental strategy to secure future reward, but also as a direct consequence of empowerment maximization -- i.e., the agent's intrinsic drive to maintain or expand its own controllability in uncertain environments. Our main contribution is to show how these intrinsic motivations (empowerment, curiosity) systematically lead universal AI agents to seek and sustain high-optionality states. We prove that Self-AIXI asymptotically converges to the same performance as AIXI under suitable conditions, and highlight that its power-seeking behavior emerges naturally from both reward maximization and curiosity-driven exploration. Since AIXI can be viewed as a Bayes-optimal mathematical formulation for Artificial General Intelligence (AGI), our results can be useful for further discussion on AI safety and the controllability of AGI.
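
As a rough illustration of what a variational empowerment objective looks like, the sketch below uses the standard variational lower bound on mutual information, E[log q(a|s) - log p(a)] <= I(A; S), on a toy two-action, two-state channel. The bound equals the mutual information minus an expected KL divergence between the true posterior p(a|s) and the approximation q(a|s), which matches the "negative KL term plus mutual information" shape described in the summary. This is only an assumption about how the pieces fit together, not the paper's exact formulation; the channel probabilities and all function names here are hypothetical.

```python
import math

def joint(p_a, p_s_given_a):
    """Joint p(a, s) = p(a) * p(s | a), indexed [a][s]."""
    return [[pa * ps for ps in row] for pa, row in zip(p_a, p_s_given_a)]

def exact_posterior(p_a, p_s_given_a):
    """Bayes posterior p(a | s), indexed [s][a]."""
    j = joint(p_a, p_s_given_a)
    posterior = []
    for s in range(len(p_s_given_a[0])):
        column = [j[a][s] for a in range(len(p_a))]
        z = sum(column)
        posterior.append([c / z for c in column])
    return posterior

def mutual_information(p_a, p_s_given_a):
    """Exact I(A; S) in nats for a discrete action-to-state channel."""
    j = joint(p_a, p_s_given_a)
    n_a, n_s = len(p_a), len(p_s_given_a[0])
    p_s = [sum(j[a][s] for a in range(n_a)) for s in range(n_s)]
    return sum(
        j[a][s] * math.log(j[a][s] / (p_a[a] * p_s[s]))
        for a in range(n_a) for s in range(n_s) if j[a][s] > 0
    )

def variational_bound(p_a, p_s_given_a, q_a_given_s):
    """Lower bound E[log q(a | s) - log p(a)] on I(A; S)."""
    j = joint(p_a, p_s_given_a)
    n_a, n_s = len(p_a), len(p_s_given_a[0])
    return sum(
        j[a][s] * (math.log(q_a_given_s[s][a]) - math.log(p_a[a]))
        for a in range(n_a) for s in range(n_s) if j[a][s] > 0
    )

# Hypothetical toy setup: two actions, two successor states.
p_a = [0.5, 0.5]                      # action distribution p(a)
channel = [[0.9, 0.1], [0.2, 0.8]]    # transition model p(s | a)

mi = mutual_information(p_a, channel)
# With the exact posterior as q, the KL gap vanishes and the bound is tight.
tight = variational_bound(p_a, channel, exact_posterior(p_a, channel))
# With an uninformative q, the bound is loose but still valid.
loose = variational_bound(p_a, channel, [[0.5, 0.5], [0.5, 0.5]])

print("I(A;S):", mi)
print("bound with exact posterior:", tight)
print("bound with uniform q:", loose)
```

The gap between `mi` and `loose` is the expected KL divergence between the true posterior and the uniform guess, so improving q tightens the bound, while changing p(a) or the reachable states changes the mutual information itself.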

Paper Structure

This paper contains 20 sections, 30 equations, 1 table.