Energy-Efficient Deep Reinforcement Learning with Spiking Transformers
Mohammad Irfan Uddin, Nishad Tasnim, Md Omor Faruk, Zejian Zhou
TL;DR
The paper tackles the energy cost of Transformer-based deep RL in long-horizon tasks by combining spiking neural networks with Transformer-style sequence modeling. It introduces STRL, a Spike-Transformer Reinforcement Learning architecture that replaces dense activations with multi-step Leaky Integrate-and-Fire (LIF) neurons inside Transformer blocks, enabling spike-driven sequence processing for offline RL. On large maze navigation datasets, STRL reaches a test accuracy of $99.64\%$, compared with $79.82\%$ for the Decision Transformer, while offering potential energy advantages on neuromorphic hardware. The work points to a viable path for deploying bio-inspired, low-cost sequential decision-makers in energy-constrained autonomous systems and real-time robotics.
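To make the idea of replacing dense activations with multi-step LIF neurons concrete, the following is a minimal sketch of a textbook LIF layer unrolled over several time steps. It is not the authors' implementation: the function name `multi_step_lif`, the time constant `tau`, the threshold, and the hard-reset rule are all illustrative assumptions, and the summary does not state which values or reset scheme STRL uses.

```python
from typing import List


def multi_step_lif(
    inputs: List[List[float]],
    tau: float = 2.0,          # assumed membrane time constant (illustrative)
    v_threshold: float = 1.0,  # assumed firing threshold (illustrative)
    v_reset: float = 0.0,      # assumed hard-reset potential (illustrative)
) -> List[List[int]]:
    """Run a layer of LIF neurons over T time steps.

    inputs: T time steps, each a list of N input currents.
    Returns T lists of N binary spikes (0 or 1).
    """
    if not inputs:
        return []
    # One membrane potential per neuron, carried across time steps.
    v = [v_reset] * len(inputs[0])
    spikes_over_time = []
    for x_t in inputs:
        spikes_t = []
        for i, x in enumerate(x_t):
            # Leaky integration: the potential decays toward v_reset
            # while being driven by the input current.
            v[i] += (x - (v[i] - v_reset)) / tau
            if v[i] >= v_threshold:
                spikes_t.append(1)  # emit a spike
                v[i] = v_reset      # hard reset after firing
            else:
                spikes_t.append(0)
        spikes_over_time.append(spikes_t)
    return spikes_over_time


if __name__ == "__main__":
    # Three neurons receiving constant currents for four time steps.
    currents = [[1.5, 0.0, 4.0] for _ in range(4)]
    for t, s in enumerate(multi_step_lif(currents)):
        print(t, s)
```

Because the output is binary, downstream layers only need to accumulate weights where a spike occurred, which is the usual argument for the energy savings of spiking models on neuromorphic hardware.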
Abstract
Agent-based Transformers have been widely adopted in recent reinforcement learning advances due to their demonstrated ability to solve complex tasks. However, the high computational complexity of Transformers often results in significant energy consumption, limiting their deployment in real-world autonomous systems. Spiking neural networks (SNNs), with their biologically inspired structure, offer an energy-efficient alternative for machine learning. This paper develops a novel Spike-Transformer Reinforcement Learning (STRL) algorithm that combines the energy efficiency of SNNs with the decision-making capabilities of reinforcement learning. Specifically, it designs an SNN that uses multi-step Leaky Integrate-and-Fire (LIF) neurons and attention mechanisms to process spatio-temporal patterns over multiple time steps. The architecture is further enhanced with state, action, and reward encodings to create a Transformer-like structure optimized for reinforcement learning tasks. Comprehensive numerical experiments on state-of-the-art benchmarks demonstrate that the proposed SNN Transformer achieves significantly better policy performance than conventional agent-based Transformers. By improving both energy efficiency and policy optimality, this work highlights a promising direction for deploying bio-inspired, low-cost machine learning models in complex real-world decision-making scenarios.
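The abstract mentions state, action, and reward encodings feeding a Transformer-like structure but does not spell out the token layout. As a hedged illustration, the sketch below arranges an offline trajectory the way the Decision Transformer baseline does: each time step contributes a return-to-go token, a state token, and an action token, in that order. Whether STRL uses exactly this ordering, or returns-to-go instead of raw rewards, is an assumption here; the helper names `returns_to_go` and `interleave_trajectory` are hypothetical.

```python
from typing import Any, List, Sequence, Tuple

Token = Tuple[str, int, Any]  # (kind, time step, raw value before embedding)


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Sum of rewards from each time step to the end of the trajectory."""
    rtg: List[float] = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave_trajectory(
    states: Sequence[Any],
    actions: Sequence[Any],
    rewards: Sequence[float],
) -> List[Token]:
    """Build a (return, state, action) token sequence per time step.

    The ordering follows the Decision Transformer convention and is an
    assumption about STRL, not a detail confirmed by the summary.
    """
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions, and rewards must have equal length")
    tokens: List[Token] = []
    for t, (g, s, a) in enumerate(zip(returns_to_go(rewards), states, actions)):
        tokens.append(("return", t, g))
        tokens.append(("state", t, s))
        tokens.append(("action", t, a))
    return tokens


if __name__ == "__main__":
    # Toy maze trajectory: grid positions, moves, and a sparse goal reward.
    states = [(0, 0), (0, 1), (1, 1)]
    actions = ["right", "down", "stay"]
    rewards = [0.0, 0.0, 1.0]
    for token in interleave_trajectory(states, actions, rewards):
        print(token)
```

In a full model, each token would be passed through a modality-specific embedding before entering the attention blocks; the sketch stops at the raw token sequence to avoid guessing at those layers.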
