Spiking World Model with Multi-Compartment Neurons for Model-based Reinforcement Learning

Yinqian Sun, Feifei Zhao, Mingyang Lv, Yi Zeng

TL;DR

This work tackles model-based reinforcement learning with spiking neural networks by introducing a multi-compartment neuron (MCN) that performs nonlinear dendritic integration across apical and basal dendrites. Building on this, the Spiking-WM framework implements a fully spiking Dreamer-like world model, using an MCN-based state-space representation and spiking encoders/decoders for both visual and sequential data. Across DeepMind Control tasks and long-sequence speech benchmarks (SHD, TIMIT, LibriSpeech 100h), Spiking-WM achieves competitive performance with GRU-based, ANN-driven world models and surpasses existing SNN approaches, highlighting the importance of dendritic computations for memory and decision-making in SNNs. The results suggest that cooperative dendritic processing can yield robust, energy-efficient models for model-based RL, with broad implications for biologically plausible AI and neuromorphic computing, and are supported by extensive analyses of membrane dynamics and spiking activity. All code and models are available via the BrainCog Embot platform, facilitating further research in dendrite-aware SNNs for sequential decision tasks.

Abstract

Brain-inspired spiking neural networks (SNNs) have garnered significant research attention in algorithm design and perception applications. However, their potential in the decision-making domain, particularly in model-based reinforcement learning, remains underexplored. The difficulty lies in the need for spiking neurons with long-term temporal memory capabilities, as well as network optimization that can integrate and learn information for accurate predictions. The dynamic dendritic information integration mechanism of biological neurons offers valuable insights for addressing these challenges. In this study, we propose a multi-compartment neuron model capable of nonlinearly integrating information from multiple dendritic sources to dynamically process long sequential inputs. Based on this model, we construct a Spiking World Model (Spiking-WM) to enable model-based deep reinforcement learning (DRL) with SNNs. We evaluate our model on the DeepMind Control Suite, demonstrating that Spiking-WM outperforms existing SNN-based models and achieves performance comparable to artificial neural network (ANN)-based world models employing Gated Recurrent Units (GRUs). Furthermore, we assess the long-term memory capabilities of the proposed model on speech datasets, including SHD, TIMIT, and LibriSpeech 100h, showing that our multi-compartment neuron model surpasses other SNN-based architectures in processing long sequences. Our findings underscore the critical role of dendritic information integration in shaping neuronal function, emphasizing the importance of cooperative dendritic processing in enhancing neural computation.

Paper Structure

This paper contains 14 sections, 26 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The spiking world model based on MCN. (a) Biological structure of pyramidal neurons (left) and the multi-compartment neuron model (right). (b) The architecture of spiking world model.
  • Figure 2: Evaluation scores for DeepMind visual control experiments.
  • Figure 3: Comparison of visual state predictions for different neuron models.
  • Figure 4: Experimental results on the impact of MCN parameters on model performance. (a) Comparison of MCNs with and without learnable time decay parameters. (b) The distribution of soma, basal dendrite and apical dendrite time parameters after learning. (c) Grid search over the basal dendrite conductance parameter and the apical dendrite gate parameter on the SHD dataset. (d) Grid search over the basal dendrite conductance parameter and the apical dendrite gate parameter on Walker Walk.
  • Figure 5: Statistics of dendritic membrane potential and spike firing activity of all multi-compartment neurons during prediction on the Walker Run task. (a) The apical dendritic potential of the MCNs. (b) The basal dendritic potential of the MCNs. (c) The spike sequences of the MCNs during the model prediction process. (d) Details of the dendritic membrane potential and spike firing of the 200th neuron.
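
The figure captions refer to several MCN quantities: separate time decay parameters for the soma, basal dendrite and apical dendrite, a basal dendrite conductance, an apical dendrite gate, and the resulting membrane potentials and spikes. As a rough illustration of how such quantities could interact, the following is a minimal sketch of a discrete-time multi-compartment neuron. It is not the paper's formulation: the update rule (a sigmoid of the apical potential gating the basal drive), the hard reset, and every name and value (`mcn_step`, `run_mcn`, `PARAMS`, `g_basal`, `gate_apical`) are assumptions chosen for illustration only.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def mcn_step(state, x_basal, x_apical, params):
    """Advance a toy multi-compartment neuron layer by one time step.

    All dynamics here are illustrative assumptions, not the paper's equations.
    """
    v_b, v_a, v_s = state
    # Leaky integration in each dendrite, each with its own decay factor.
    v_b = params["decay_basal"] * v_b + x_basal
    v_a = params["decay_apical"] * v_a + x_apical
    # Assumed nonlinearity: the apical potential gates the basal drive.
    drive = params["g_basal"] * v_b * sigmoid(params["gate_apical"] * v_a)
    # The soma leaks and accumulates the gated dendritic drive.
    v_s = params["decay_soma"] * v_s + drive
    spikes = (v_s >= params["threshold"]).astype(np.float64)
    # Hard reset: neurons that fired return to zero potential.
    v_s = v_s * (1.0 - spikes)
    return (v_b, v_a, v_s), spikes


def run_mcn(x_basal_seq, x_apical_seq, params):
    """Run the toy layer over inputs of shape (time, neurons)."""
    n = x_basal_seq.shape[1]
    state = (np.zeros(n), np.zeros(n), np.zeros(n))
    out = []
    for x_b, x_a in zip(x_basal_seq, x_apical_seq):
        state, s = mcn_step(state, x_b, x_a, params)
        out.append(s)
    return np.stack(out)


# Hypothetical parameter values; in the paper the decays are learnable.
PARAMS = {
    "decay_basal": 0.9,
    "decay_apical": 0.9,
    "decay_soma": 0.9,
    "g_basal": 1.0,
    "gate_apical": 1.0,
    "threshold": 1.0,
}

rng = np.random.default_rng(0)
T, N = 50, 8
spikes = run_mcn(rng.random((T, N)), rng.random((T, N)), PARAMS)
print(spikes.shape, spikes.mean())
```

In this sketch, a larger `gate_apical` makes the soma more sensitive to the apical context, while `g_basal` scales how strongly the basal potential drives the soma. The decay factors set how long each compartment retains past input, which is where a long-term memory effect could arise.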