Generalizable Reinforcement Learning with Biologically Inspired Hyperdimensional Occupancy Grid Maps for Exploration and Goal-Directed Path Planning
Shay Snyder, Ryan Shea, Andrew Capodieci, David Gorsich, Maryam Parsa
TL;DR
The paper addresses the challenge of generalizing reinforcement learning policies for exploration and path planning by comparing a biologically inspired hyperdimensional occupancy grid mapping method (VSA-OGM) against a traditional OGM method, Bayesian Hilbert Maps (BHM). By converting LiDAR data into an OGM format and feeding it to a PPO-based RL pipeline, the study demonstrates that VSA-OGM achieves comparable learning performance while improving generalization to unseen environments, with gains of up to approximately 53% on MarsExplorer and 47% on RaceCarGym. These generalization benefits come at a cost: VSA-OGM incurs higher latency and a larger memory footprint than BHM, especially under high-density LiDAR conditions. The results support VSA-OGM as a promising neuromorphic-compatible alternative for robust deployment in diverse environments, and point to future work on reducing encoding complexity and extending the approach to model-based RL. The work thus contributes to scalable, generalizable perception-to-control pipelines for real-world autonomous systems.
Abstract
Real-time autonomous systems use multi-layer computational frameworks to perform critical tasks such as perception, goal finding, and path planning. Traditional methods implement perception using occupancy grid mapping (OGM), segmenting the environment into discretized cells with probabilistic information. This classical approach is well established and provides a structured input for downstream processes such as goal finding and path planning algorithms. Recent approaches leverage a biologically inspired mathematical framework known as vector symbolic architectures (VSA), also known as hyperdimensional computing, to perform probabilistic OGM in hyperdimensional space. This approach, VSA-OGM, provides native compatibility with spiking neural networks, positioning it as a potential neuromorphic alternative to conventional OGM. However, for large-scale integration, it is essential to assess the performance implications of VSA-OGM on downstream tasks compared to established OGM methods. This study examines the efficacy of VSA-OGM against a traditional OGM approach, Bayesian Hilbert Maps (BHM), within reinforcement-learning-based goal finding and path planning frameworks, across a controlled exploration environment and an autonomous driving scenario inspired by the F1-Tenth challenge. Our results demonstrate that VSA-OGM maintains comparable learning performance across single and multi-scenario training configurations while improving performance on unseen environments by approximately 47%. These findings highlight the increased generalizability of policy networks trained with VSA-OGM over BHM, reinforcing its potential for real-world deployment in diverse environments.
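To make the idea of occupancy mapping "in hyperdimensional space" concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common VSA construction, fractional power encoding, in which a 2D point is mapped to a high-dimensional vector by raising random unitary base vectors to real-valued powers in the Fourier domain; the abstract does not state which encoding VSA-OGM uses. The function names, the dimensionality of 2048, the length scale, and the example points are all hypothetical choices made for illustration.

```python
import numpy as np

def make_axis_phases(dim, rng):
    """Random phases defining one unitary base vector (hypothetical helper)."""
    phases = rng.uniform(-np.pi, np.pi, size=dim // 2 + 1)
    phases[0] = 0.0           # DC term must stay real
    if dim % 2 == 0:
        phases[-1] = 0.0      # Nyquist term must stay real
    return phases

def encode_point(x, y, phases_x, phases_y, dim, length_scale=1.0):
    """Fractional power encoding of (x, y) as a real, unit-norm hypervector."""
    # Binding the two axis vectors raised to powers x and y is a product
    # in the Fourier domain, i.e. a sum of scaled phases.
    spectrum = np.exp(1j * (phases_x * x + phases_y * y) / length_scale)
    return np.fft.irfft(spectrum, n=dim)

dim = 2048                                  # assumed dimensionality
rng = np.random.default_rng(0)
phases_x = make_axis_phases(dim, rng)
phases_y = make_axis_phases(dim, rng)

# Bundle (sum) the encodings of a few occupied LiDAR returns into one
# memory vector; these points are made up for the example.
occupied_points = [(1.0, 1.0), (1.0, 2.0), (4.0, 3.0)]
occupied_memory = np.zeros(dim)
for px, py in occupied_points:
    occupied_memory += encode_point(px, py, phases_x, phases_y, dim)

def query(x, y):
    """Similarity between a query location and the occupied memory."""
    return float(np.dot(occupied_memory,
                        encode_point(x, y, phases_x, phases_y, dim)))

similarity_occupied = query(1.0, 2.0)    # a location that was observed
similarity_free = query(10.0, 10.0)      # a location far from any return
print(similarity_occupied, similarity_free)
```

In this sketch the similarity is high near bundled points and close to zero elsewhere, so it acts as an unnormalized occupancy score. Turning such scores into calibrated probabilities, and maintaining a separate memory for free space, would require further steps that are not shown here.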
