Curiosity is Knowledge: Self-Consistent Learning and No-Regret Optimization with Active Inference

Yingke Li; Anjali Parashar; Enlu Zhou; Chuchu Fan

TL;DR

The paper addresses how to robustly balance exploration and exploitation in sequential decision-making with active inference, which selects actions by minimizing the Expected Free Energy (EFE). It shows that sufficient curiosity, expressed as a lower bound on the curiosity coefficient β_t, yields both posterior consistency in learning and no-regret optimization, unifying Bayesian Optimization and Bayesian Experimental Design within a single framework. Two theorems provide the guarantees: one on posterior consistency and one giving a Gaussian process (GP)-based no-regret bound. The paper also offers practical guidelines for adaptive curiosity scheduling and energy function design. The results are validated through synthetic experiments and real-world hybrid learning-optimization tasks, with relevance to robotics, adaptive experimentation, and complex design problems.

Abstract

Active inference (AIF) unifies exploration and exploitation by minimizing the Expected Free Energy (EFE), balancing epistemic value (information gain) and pragmatic value (task performance) through a curiosity coefficient. Yet it has been unclear when this balance yields both coherent learning and efficient decision-making: insufficient curiosity can drive myopic exploitation and prevent uncertainty resolution, while excessive curiosity can induce unnecessary exploration and regret. We establish the first theoretical guarantee for EFE-minimizing agents, showing that a single requirement--sufficient curiosity--simultaneously ensures self-consistent learning (Bayesian posterior consistency) and no-regret optimization (bounded cumulative regret). Our analysis characterizes how this mechanism depends on initial uncertainty, identifiability, and objective alignment, thereby connecting AIF to classical Bayesian experimental design and Bayesian optimization within one theoretical framework. We further translate these theories into practical design guidelines for tuning the epistemic-pragmatic trade-off in hybrid learning-optimization problems, validated through real-world experiments.
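The epistemic-pragmatic trade-off described above can be illustrated with a minimal sketch on a discrete latent parameter. This is not the paper's exact EFE objective: it assumes a score of the form `beta * information_gain + expected_reward` that is maximized (equivalently, its negative is minimized), and all names (`information_gain`, `select_query`, `bayes_update`, the toy `likelihood` and `reward` arrays) are hypothetical choices made for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) along the last axis, treating 0*log(0) as 0."""
    p = np.asarray(p, dtype=float)
    safe = np.where(p > 0, p, 1.0)  # log(1) = 0, so zero-probability terms vanish
    return -np.sum(p * np.log(safe), axis=-1)

def information_gain(posterior, likelihood):
    """Mutual information I(s; y | x) for every candidate query x.

    posterior:  shape (S,), current belief over the discrete latent s.
    likelihood: shape (X, S, Y), assumed observation model p(y | x, s).
    """
    predictive = np.einsum("s,xsy->xy", posterior, likelihood)  # p(y | x)
    marginal_entropy = entropy(predictive)                       # H[y | x]
    conditional_entropy = np.einsum("s,xs->x", posterior, entropy(likelihood))
    return marginal_entropy - conditional_entropy

def select_query(posterior, likelihood, reward, beta):
    """Pick the query maximizing beta * epistemic value + pragmatic value.

    reward: shape (X, S), an assumed task reward for query x under latent s.
    """
    epistemic = information_gain(posterior, likelihood)
    pragmatic = np.einsum("s,xs->x", posterior, reward)  # expected reward
    return int(np.argmax(beta * epistemic + pragmatic))

def bayes_update(posterior, likelihood, x, y):
    """Posterior over s after observing outcome y at query x."""
    unnormalized = posterior * likelihood[x, :, y]
    return unnormalized / unnormalized.sum()

# Toy problem (assumed): two hypotheses, two queries, binary outcomes.
# Query 0 is informative but unrewarding; query 1 is rewarding but uninformative.
likelihood = np.array([
    [[0.1, 0.9], [0.9, 0.1]],
    [[0.5, 0.5], [0.5, 0.5]],
])
reward = np.array([[0.0, 0.0], [0.1, 0.1]])
posterior = np.array([0.5, 0.5])

greedy_choice = select_query(posterior, likelihood, reward, beta=0.0)
curious_choice = select_query(posterior, likelihood, reward, beta=1.0)
updated = bayes_update(posterior, likelihood, x=0, y=1)
```

In this toy setting, `beta=0.0` selects the rewarding but uninformative query 1, so the belief over `s` never changes, while `beta=1.0` selects the informative query 0 and the posterior moves toward one hypothesis. This mirrors, in a very simplified form, the abstract's point that insufficient curiosity can prevent uncertainty resolution.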

Paper Structure (32 sections, 7 theorems, 58 equations, 4 figures, 2 tables)

Introduction
Preliminaries
Bayesian Optimization
Bayesian Experimental Design
Active Inference Bridges Learning and Optimization
Performance Evaluation in Learning and Optimization
Self-Consistent Learning
Interpretation of Assumptions
Interpretation of Convergence Rate
Practical Usability and Implications
No-Regret Optimization
Interpretation of Assumptions
Interpretation of Cumulative Regret Bound
Practical Usability and Implications
Experiments for Theorem Validation
...and 17 more sections

Key Result

Theorem 5.1

Let $s$ be a discrete latent parameter of the model with parameter space $\mathcal{S}$, and $s^{\ast} \in \mathcal{S}$ denote the true (data-generating) parameter. At each iteration $t$, the query $x_{t} \in \mathcal{X}$ is chosen according to the AIF policy: where $I(s; (x,y) \mid \mathcal{D}_{t-1})$ is the conditional mutual information between $s$ and the next observation pair $(x,y)$, and $h_

Figures (4)

Figure 1: Discrete sandbox to validate Theorem 5.1. Error bars represent $\pm 0.2$ std over 5 seeds.
Figure 2: 1D GP bandit to validate Theorem 6.1. Error bars represent $\pm 0.2$ std over 5 seeds.
Figure 3: Constrained system identification on environmental monitoring in 2D plume fields. Error bars represent $\pm 0.2$ std over 5 seeds.
Figure 4: Composite BO on distributed energy resource allocation in power grids. Error bars represent $\pm 0.2$ std over 5 seeds.

Theorems & Definitions (16)

Definition 4.1: Posterior Consistency
Definition 4.2: Regret Function
Definition 4.3: Potential Energy Function
Theorem 5.1: Posterior Consistency in AIF
proof
Theorem 6.1: Cumulative Regret Bound in AIF
proof
proof
Lemma 2.1: Lemma 5.3 in Srinivas2009GaussianDesign
Lemma 2.2: Lemma 5.5 in Srinivas2009GaussianDesign
...and 6 more
