Interpretable Brain-Inspired Representations Improve RL Performance on Visual Navigation Tasks
Moritz Lange, Raphael C. Engelhardt, Wolfgang Konen, Laurenz Wiskott
TL;DR
The paper tackles self-localization in visual navigation by using hierarchical slow feature analysis (hSFA) to extract interpretable representations of an agent's location and heading directly from visual input. The hSFA features are compared with CNN and PCA baselines by feeding each into PPO-based RL agents in four Miniworld environments. hSFA yields robust localization cues and improves navigation efficiency on some tasks (notably StarMazeArm), but the experiments also expose limitations related to environment symmetries and to the coverage of the training data. The study presents the slowness prior as a strong inductive bias for localization, discusses constraints on training and integration, and argues for future work on online end-to-end training, integration with planning, and the transferability of learned representations. Overall, it shows that neuroscience-inspired representations can improve explainability and may guide the development of more robust, interpretable RL agents for visual navigation.
Abstract
Visual navigation requires a whole range of capabilities. A crucial one is the ability of an agent to determine its own location and heading in an environment. Prior work commonly assumes that this information is given, or uses methods that lack a suitable inductive bias and accumulate error over time. In this work, we show how slow feature analysis (SFA), a method inspired by neuroscience research, overcomes both limitations by generating interpretable representations of visual data that encode the location and heading of an agent. We employ SFA in a modern reinforcement learning context, analyse and compare representations, and illustrate where hierarchical SFA can outperform other feature extractors on navigation tasks.
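To make the slowness principle behind SFA concrete, here is a minimal sketch of plain linear SFA in NumPy. It is not the authors' implementation and it is not hierarchical: the paper's hSFA is applied to images, whereas this sketch handles only a small toy signal. The function name `linear_sfa`, the rank threshold `1e-10`, and the sine-wave data are assumptions made purely for illustration. The sketch whitens the input (zero mean, unit variance, decorrelated outputs) and then keeps the directions in which the whitened signal changes least from one time step to the next.

```python
import numpy as np


def linear_sfa(x, n_features):
    """Linear slow feature analysis (illustrative sketch).

    x: array of shape (T, D), one observation per time step.
    Returns an array of shape (T, n_features), slowest feature first.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)  # zero-mean constraint

    # Whitening enforces unit variance and decorrelation of the outputs.
    cov = x.T @ x / x.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10 * eigval.max()  # drop near-degenerate directions
    whiten = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = x @ whiten

    # Slowness: find directions with the smallest variance of the
    # discrete time derivative of the whitened signal.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / dz.shape[0]
    _, dvec = np.linalg.eigh(dcov)  # eigenvalues in ascending order
    return z @ dvec[:, :n_features]


# Toy usage: a slow and a fast sine wave, linearly mixed into two channels.
t = np.linspace(0, 2 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(11 * t)
mixed = np.column_stack([slow + fast, slow - 0.5 * fast])
y = linear_sfa(mixed, n_features=1)
print(abs(np.corrcoef(y[:, 0], slow)[0, 1]))  # correlation with the slow source
```

On this toy input the slowest output should track the slow sine wave up to sign, which is the same mechanism that, at a much larger scale, lets slowly varying quantities such as location and heading emerge from quickly varying pixels.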
