Hybrid Spiking Neural Network -- Transformer Video Classification Model
Aaron Bateni
TL;DR
This work tackles time-series classification using a cortical-column-inspired hybrid spiking neural network (SNN) that processes temporal information via spike timing. It combines unsupervised spike-timing-dependent plasticity (STDP) in an initial stage with reward-modulated STDP in a two-class cortical-column final layer, supported by multiple spike-encoding schemes and a two-step input pipeline (video→images→spike trains). The authors present concrete encoding methods, a biologically inspired network with excitatory and inhibitory connections, and dynamic reward strategies to drive learning. They validate the approach on a small sentiment-like dataset built from video-derived text representations and provide public code. The results suggest potential advantages for small-scale, temporally aware classification, including low power consumption and parallel processing, while highlighting training-time challenges that motivate future hardware and algorithmic optimizations.
Abstract
In recent years, spiking neural networks (SNNs) have attracted significant interest for their ability to process temporal information. Building on previous hybrid models, this work introduces what is, to the best of our knowledge, the first cortical-column-like hybrid architecture for time-series data classification that leverages SNNs and is inspired by the structure of the brain. We introduce several encoding methods for use with this model and develop a procedure for training the network. To make these models easier to use, we make all implementations publicly available.
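To make the terms above concrete, the following is a minimal sketch of one common spike-encoding scheme (Bernoulli rate coding) and of a pair-based STDP update with a scalar reward factor. It is not the implementation released with this work: the function names `rate_encode`, `stdp_dw`, and `reward_stdp_dw`, the constants `a_plus`, `a_minus`, `tau`, and `max_prob`, and the toy 2×2 frame are all assumptions chosen for illustration, and the reward term is a simplification that omits the eligibility traces often used in reward-modulated STDP.

```python
import math
import random

def rate_encode(frame, n_steps=100, max_prob=0.5, seed=0):
    """Bernoulli rate coding: brighter pixels spike more often.

    frame: list of rows of pixel intensities in [0, 255].
    Returns n_steps lists, each holding one 0/1 spike per pixel.
    """
    rng = random.Random(seed)
    # Per-step spike probability for each pixel, scaled into [0, max_prob].
    probs = [max_prob * px / 255.0 for row in frame for px in row]
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(n_steps)]

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP weight change for dt = t_post - t_pre (ms)."""
    if dt > 0:
        # Presynaptic spike preceded the postsynaptic one: potentiate.
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        # Postsynaptic spike preceded the presynaptic one: depress.
        return -a_minus * math.exp(dt / tau)
    return 0.0

def reward_stdp_dw(dt, reward, **kwargs):
    """Simplified reward modulation: scale the STDP update by a reward."""
    return reward * stdp_dw(dt, **kwargs)

# Hypothetical 2x2 grayscale frame standing in for one video image.
frame = [[0, 255], [128, 64]]
spikes = rate_encode(frame, n_steps=200)
# Total spikes emitted per pixel over the whole window.
counts = [sum(step[i] for step in spikes) for i in range(4)]
```

In this sketch a black pixel never spikes and brighter pixels spike more often on average, so intensity is carried by firing rate. A positive `reward` leaves the sign of the STDP update unchanged and a negative `reward` reverses it, which is the basic idea behind using a reward signal to steer learning in a classification layer.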
