Curiosity is Knowledge: Self-Consistent Learning and No-Regret Optimization with Active Inference
Yingke Li, Anjali Parashar, Enlu Zhou, Chuchu Fan
TL;DR
The paper addresses how to robustly balance exploration and exploitation in sequential decision-making by using active inference and minimizing the Expected Free Energy. It shows that a sufficient curiosity level, captured by a lower bound on the curiosity coefficient β_t, yields both posterior consistency in learning and no-regret optimization, unifying Bayesian Optimization and Bayesian Experimental Design within a single framework. Theoretical guarantees are provided via two theorems: one establishing posterior consistency and one giving a GP-based no-regret bound, with practical guidelines for adaptive curiosity scheduling and energy function design. The results are validated through synthetic experiments and real-world hybrid learning-optimization tasks, highlighting the practical impact for robotics, adaptive experimentation, and complex design problems.
Abstract
Active inference (AIF) unifies exploration and exploitation by minimizing the Expected Free Energy (EFE), balancing epistemic value (information gain) and pragmatic value (task performance) through a curiosity coefficient. Yet it has been unclear when this balance yields both coherent learning and efficient decision-making: insufficient curiosity can drive myopic exploitation and prevent uncertainty resolution, while excessive curiosity can induce unnecessary exploration and regret. We establish the first theoretical guarantee for EFE-minimizing agents, showing that a single requirement, sufficient curiosity, simultaneously ensures self-consistent learning (Bayesian posterior consistency) and no-regret optimization (bounded cumulative regret). Our analysis characterizes how this mechanism depends on initial uncertainty, identifiability, and objective alignment, thereby connecting AIF to classical Bayesian experimental design and Bayesian optimization within one theoretical framework. We further translate these theories into practical design guidelines for tuning the epistemic-pragmatic trade-off in hybrid learning-optimization problems, validated through real-world experiments.
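To make the epistemic-pragmatic trade-off concrete, the following is a minimal sketch of an EFE-style selection rule, not the paper's exact formulation. It assumes a finite set of candidate actions, each with an independent Gaussian posterior over its payoff, Gaussian observation noise, the posterior mean as pragmatic value, and the expected information gain of one noisy observation as epistemic value. The names `information_gain`, `select_action`, `beta`, and `noise_var` are hypothetical and chosen only for illustration.

```python
import math


def information_gain(prior_var, noise_var):
    """Expected information gain (in nats) from one noisy Gaussian
    observation of a quantity with Gaussian prior variance prior_var."""
    return 0.5 * math.log(1.0 + prior_var / noise_var)


def select_action(means, variances, beta, noise_var=0.1):
    """Return the index of the candidate minimizing an EFE-style score.

    Assumed score (illustrative only):
        score = -(pragmatic + beta * epistemic)
    where pragmatic is the posterior mean and epistemic is the
    expected information gain. beta plays the role of the curiosity
    coefficient.
    """
    scores = [
        -(m + beta * information_gain(v, noise_var))
        for m, v in zip(means, variances)
    ]
    return min(range(len(scores)), key=scores.__getitem__)


# Three hypothetical candidates: well-known and good, moderately
# uncertain, and highly uncertain with a low current estimate.
means = [1.0, 0.8, 0.2]
variances = [0.01, 0.5, 2.0]

greedy = select_action(means, variances, beta=0.0)   # no curiosity
curious = select_action(means, variances, beta=2.0)  # strong curiosity
print(greedy, curious)
```

With `beta=0.0` the rule ignores uncertainty and exploits the candidate with the highest current estimate; with a larger `beta` the information-gain term dominates and the most uncertain candidate is chosen. This mirrors, in a toy setting, the abstract's point that too little curiosity leaves uncertainty unresolved while too much induces unnecessary exploration.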
