Spiking World Model with Multi-Compartment Neurons for Model-based Reinforcement Learning
Yinqian Sun, Feifei Zhao, Mingyang Lv, Yi Zeng
TL;DR
This work tackles model-based reinforcement learning with spiking neural networks (SNNs) by introducing a multi-compartment neuron (MCN) that performs nonlinear dendritic integration across apical and basal dendrites. Building on this neuron, the Spiking-WM framework implements a fully spiking, Dreamer-like world model that uses an MCN-based state-space representation and spiking encoders and decoders for both visual and sequential data. On DeepMind Control tasks, Spiking-WM performs comparably to GRU-based ANN world models and outperforms existing SNN-based models; on long-sequence speech benchmarks (SHD, TIMIT, LibriSpeech 100h), the MCN surpasses other SNN architectures. These results highlight the importance of dendritic computation for memory and decision-making in SNNs. Supported by analyses of membrane dynamics and spiking activity, they suggest that cooperative dendritic processing can yield robust, energy-efficient models for model-based RL, with implications for biologically plausible AI and neuromorphic computing. Code and models are available via the BrainCog Embot platform.
Abstract
Brain-inspired spiking neural networks (SNNs) have garnered significant research attention in algorithm design and perception applications. However, their potential in decision-making, particularly in model-based reinforcement learning, remains underexplored. The difficulty is twofold: spiking neurons need long-term temporal memory, and the network must be optimized to integrate and learn information for accurate predictions. The dynamic dendritic information integration mechanism of biological neurons offers valuable insights for addressing these challenges. In this study, we propose a multi-compartment neuron model capable of nonlinearly integrating information from multiple dendritic sources to dynamically process long sequential inputs. Based on this model, we construct a Spiking World Model (Spiking-WM) to enable model-based deep reinforcement learning (DRL) with SNNs. We evaluate our model on the DeepMind Control Suite, demonstrating that Spiking-WM outperforms existing SNN-based models and achieves performance comparable to artificial neural network (ANN)-based world models employing Gated Recurrent Units (GRUs). Furthermore, we assess the long-term memory capabilities of the proposed model on speech datasets, including SHD, TIMIT, and LibriSpeech 100h, showing that our multi-compartment neuron model surpasses other SNN-based architectures in processing long sequences. Our findings underscore the critical role of dendritic information integration in shaping neuronal function and the importance of cooperative dendritic processing in enhancing neural computation.
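To make the idea of nonlinear integration across dendritic compartments concrete, the following is a minimal illustrative sketch in Python. It is not the neuron model proposed in the paper, whose equations are not given in this section. The function name `simulate_mcn`, the sigmoid gating of the basal drive by the apical potential, the time constants `tau_d` and `tau_s`, the threshold `v_th`, and the hard reset are all assumptions chosen only for illustration.

```python
import math


def simulate_mcn(basal_in, apical_in, tau_d=2.0, tau_s=2.0, v_th=1.0):
    """Toy two-dendrite spiking neuron (hypothetical, for illustration only).

    basal_in, apical_in: equal-length sequences of input currents.
    Returns a list of 0/1 spikes, one per time step.
    """
    v_b = v_a = v_s = 0.0  # basal, apical, and somatic potentials
    spikes = []
    for x_b, x_a in zip(basal_in, apical_in):
        # Leaky integration in each dendritic compartment.
        v_b += (x_b - v_b) / tau_d
        v_a += (x_a - v_a) / tau_d
        # Assumed nonlinearity: the apical potential gates the basal drive.
        gate = 1.0 / (1.0 + math.exp(-v_a))
        drive = gate * v_b
        # Leaky integration at the soma, then threshold and hard reset.
        v_s += (drive - v_s) / tau_s
        if v_s >= v_th:
            spikes.append(1)
            v_s = 0.0
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    steps = 20
    # Same basal input, different apical context.
    print(sum(simulate_mcn([4.0] * steps, [4.0] * steps)))
    print(sum(simulate_mcn([4.0] * steps, [-10.0] * steps)))
```

In this toy setting the same basal input produces spikes when the apical input is excitatory and none when it is strongly inhibitory, which is one simple way that two dendritic sources can interact nonlinearly rather than being summed.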
