Intrinsic Goals for Autonomous Agents: Model-Based Exploration in Virtual Zebrafish Predicts Ethological Behavior and Whole-Brain Dynamics
Reece Keller, Alyn Kirsch, Felix Pei, Xaq Pitkow, Leo Kozachkov, Aran Nayebi
TL;DR
The paper introduces 3M-Progress, a model-based intrinsic drive that combines a fixed ethological memory prior with a learnable online memory to drive autonomous exploration in a virtual zebrafish. By partitioning behavior into niche-seeking and niche-avoidance through model-memory-mismatch, the approach yields stable, animal-like state transitions and predicts whole-brain neural-glial activity, including astrocyte-mediated dynamics. The authors show that 3M-Progress replicates observed zebrafish behaviors and explains most of the explainable variance in neural-glial recordings, presenting the first goal-driven, self-supervised embodied agent that predicts brain data. The work provides a computational framework linking intrinsic motivation to neural-glial computation and offers a foundation for designing artificial agents with animal-like autonomy. It also highlights two core principles for autonomy, avoiding uncontrollable stimuli and converging to robust policies, and outlines avenues for extending the framework to richer ecological scenarios and more detailed neurobiological mechanisms.
Abstract
Autonomy is a hallmark of animal intelligence, enabling adaptive and intelligent behavior in complex environments without relying on external reward or task structure. Existing reinforcement learning approaches to exploration in reward-free environments, including a class of methods known as model-based intrinsic motivation, exhibit inconsistent exploration patterns and do not converge to an exploratory policy, thus failing to capture the robust autonomous behaviors observed in animals. Moreover, systems neuroscience has largely overlooked the neural basis of autonomy, focusing instead on experimental paradigms where animals are motivated by external reward rather than engaging in ethological, naturalistic, and task-independent behavior. To bridge these gaps, we introduce a novel model-based intrinsic drive explicitly modeled on the principles of autonomous exploration in animals. Our method (3M-Progress) achieves animal-like exploration by tracking the divergence between an online world model and a fixed prior learned from an ecological niche. To the best of our knowledge, we introduce the first autonomous embodied agent that predicts brain data entirely from self-supervised optimization of an intrinsic goal, without any behavioral or neural training data. We demonstrate that 3M-Progress agents capture the explainable variance in behavioral patterns and whole-brain neural-glial dynamics recorded from autonomously behaving larval zebrafish, thereby providing the first goal-driven, population-level model of neural-glial computation. Our findings establish a computational framework connecting model-based intrinsic motivation to naturalistic behavior, providing a foundation for building artificial agents with animal-like autonomy.
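
To make the idea of "tracking divergence between an online world model and a fixed prior" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that both models emit one-dimensional Gaussian predictions, that the mismatch is measured with a KL divergence, and that the intrinsic reward is the current mismatch minus a leaky running average of past mismatch. The class name `MismatchProgress`, the `decay` parameter, and the exact reward form are hypothetical choices made for illustration only.

```python
import math


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two 1-D Gaussians given means and variances."""
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


class MismatchProgress:
    """Hypothetical intrinsic reward based on model-prior mismatch.

    Assumption: reward = current divergence minus a leaky memory of
    past divergence. This is a sketch, not the paper's definition.
    """

    def __init__(self, decay=0.9):
        self.decay = decay   # assumed leak rate of the running memory
        self.memory = 0.0    # leaky average of past mismatch values

    def step(self, online_pred, prior_pred):
        # Each prediction is a (mean, variance) pair.
        mismatch = gaussian_kl(*online_pred, *prior_pred)
        reward = mismatch - self.memory
        # Update the memory after computing the reward.
        self.memory = self.decay * self.memory + (1.0 - self.decay) * mismatch
        return reward


if __name__ == "__main__":
    drive = MismatchProgress(decay=0.9)
    # The online model drifts away from the fixed prior, then stays put.
    for online in [(0.0, 1.0), (1.0, 1.0), (1.0, 1.0)]:
        print(drive.step(online, prior_pred=(0.0, 1.0)))
```

In this sketch the reward is largest when the mismatch first changes and shrinks as the memory catches up, which is one simple way a drive could favor sustained change over a constant level of mismatch. Whether the sign and time constants match the paper's formulation is not established here.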
