Emergent Braitenberg-style Behaviours for Navigating the ViZDoom `My Way Home' Labyrinth
Caleidgh Bayer, Robert J. Smith, Malcolm I. Heywood
TL;DR
This paper tackles navigation in a partially observable, high-dimensional labyrinth by exploring whether Braitenberg-style reactive behaviours can emerge from simple, coevolved programs. It contrasts a memoryless DQN baseline with Tangled Program Graphs (TPG), showing that small, modular program graphs that index only 0.8% of the state space can achieve robust navigation in ViZDoom's My Way Home task. The results reveal Braitenberg-like policies, derived from context-action program ensembles, that operate without explicit convolutional processing or memory and outperform the DQN baseline under the chosen setup. The work highlights the potential of structured, evolutionary programming to yield simple, interpretable navigation strategies in complex environments and suggests avenues for studying geometry-driven effects and generalization.
Abstract
Navigating complex labyrinths with tens of rooms from visual, partially observable state is typically addressed using recurrent deep reinforcement learning architectures. In this work, we show that navigation can instead be achieved through the emergent evolution of a simple Braitenberg-style heuristic that structures the interaction between agent and labyrinth, i.e. complex behaviour from simple heuristics. To do so, we adopt tangled program graphs, in which programs cooperatively coevolve to develop a modular indexing scheme that employs only 0.8\% of the state space. We attribute this simplicity to several biases implicit in the representation, such as the use of pixel indexing as opposed to deploying a convolutional kernel or image processing operators.
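To make the notion of pixel indexing concrete, the following is a minimal sketch and not the authors' implementation. It assumes a flattened greyscale frame, a hypothetical 160x120 resolution (the abstract does not state one), a toy three-operator instruction set, and a team in which the highest-bidding program supplies the action. All function names are illustrative. The point is only that each instruction reads a single pixel directly, so the fraction of the state space a policy touches can be counted exactly.

```python
import random

# Assumption: frame resolution is illustrative; the source does not state it.
WIDTH, HEIGHT = 160, 120


def make_program(num_instructions, num_registers=4, seed=0):
    """Build a random register program; each instruction reads ONE pixel."""
    rng = random.Random(seed)
    return [
        (
            rng.randrange(num_registers),        # destination register
            rng.choice(["add", "sub", "mul"]),   # toy operator set (assumption)
            rng.randrange(WIDTH * HEIGHT),       # index into the flattened frame
        )
        for _ in range(num_instructions)
    ]


def run_program(program, frame, num_registers=4):
    """Execute the program on a flattened frame and return register 0 as the bid."""
    regs = [0.0] * num_registers
    for dest, op, pixel in program:
        value = frame[pixel]  # direct pixel lookup, no convolution or filtering
        if op == "add":
            regs[dest] += value
        elif op == "sub":
            regs[dest] -= value
        else:
            regs[dest] *= value
    return regs[0]


def select_action(team, frame):
    """team is a list of (program, action) pairs; the highest bidder wins."""
    _, action = max(team, key=lambda pair: run_program(pair[0], frame))
    return action


def indexed_fraction(programs):
    """Fraction of distinct pixels referenced by any instruction in any program."""
    pixels = {pixel for program in programs for _, _, pixel in program}
    return len(pixels) / (WIDTH * HEIGHT)


if __name__ == "__main__":
    frame = [random.random() for _ in range(WIDTH * HEIGHT)]
    team = [(make_program(8, seed=s), a)
            for s, a in enumerate(["forward", "turn_left", "turn_right"])]
    print(select_action(team, frame))
    print(f"{100 * indexed_fraction([p for p, _ in team]):.3f}% of pixels indexed")
```

Because a program only ever looks up the pixels named in its instructions, a handful of short programs necessarily covers a very small share of the frame, which is the kind of sparsity the abstract reports.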
