A computational approach to visual ecology with deep reinforcement learning
Sacha Sokoloski, Jure Majnik, Philipp Berens
TL;DR
The paper introduces a deep reinforcement learning framework to study visual ecology by framing animal survival as the sole objective in a ViZDoom-based foraging task. It shows that the complexity of the agent's vision model must scale with the visual complexity of food, and that recurrent architectures are crucial to exploiting complex visual inputs on demanding tasks. The authors demonstrate that different brain architectures produce distinct representations of value and behavior, with satiety signals further shaping strategies and reducing nutritional waste. This work provides a computational platform and benchmarks for investigating how perception and value emerge under survival-driven objectives, offering insights into neural coding in visually rich ecological niches.
Abstract
Animal vision is thought to optimize various objectives from metabolic efficiency to discrimination performance, yet its ultimate objective is to facilitate the survival of the animal within its ecological niche. However, modeling animal behavior in complex environments has been challenging. To study how environments shape and constrain visual processing, we developed a deep reinforcement learning framework in which an agent moves through a 3D environment that it perceives through a vision model, and in which its only goal is to survive. Within this framework, we developed a foraging task where the agent must gather food that sustains it and avoid food that harms it. We first established that the complexity of the vision model required for survival on this task scaled with the variety and visual complexity of the food in the environment. Moreover, we showed that a recurrent network architecture was necessary to fully exploit complex vision models on the most visually demanding tasks. Finally, we showed how different network architectures learned distinct representations of the environment and task, and led the agent to exhibit distinct behavioural strategies. In summary, this paper lays the foundation for a computational approach to visual ecology, provides extensive benchmarks for future work, and demonstrates how representations and behaviour emerge from an agent's drive for survival.
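To make the survival objective concrete, the foraging task described above can be sketched as a toy health process. This is an illustration only, not the authors' ViZDoom environment: the names `ForagingState`, `step`, and `run_episode`, the health scale, the decay rate, and the food effects are all hypothetical. It assumes health drains by a fixed amount per step, nourishing food raises it, harmful food lowers it, and the episode ends when health reaches zero.

```python
from dataclasses import dataclass


@dataclass
class ForagingState:
    """Hypothetical agent state: current health and steps survived so far."""
    health: float = 100.0
    steps_survived: int = 0


def step(state, food_effect=0.0, decay=1.0, max_health=100.0):
    """Advance one time step.

    food_effect > 0 models nourishing food, < 0 harmful food, 0 no food.
    All numeric values are assumptions chosen for illustration.
    """
    health = min(max_health, state.health - decay + food_effect)
    alive = health > 0.0
    new_state = ForagingState(
        health=max(health, 0.0),
        steps_survived=state.steps_survived + (1 if alive else 0),
    )
    return new_state, alive


def run_episode(food_effects, start_health=100.0):
    """Run until the food sequence is exhausted or the agent dies."""
    state = ForagingState(health=start_health)
    for effect in food_effects:
        state, alive = step(state, effect)
        if not alive:
            break
    return state


if __name__ == "__main__":
    starving = run_episode([0.0] * 5, start_health=3.0)
    fed = run_episode([0.0, 0.0, 10.0, 0.0, 0.0], start_health=3.0)
    print(starving.steps_survived, fed.steps_survived)
```

In this sketch the only quantity to maximise is `steps_survived`, which mirrors the idea of survival as the sole objective; how the paper actually defines the reward signal is not specified in this excerpt.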
