SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms

Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li

TL;DR

SpikeLM tackles the challenge of fully spike-driven language modeling by introducing elastic bi-spiking that augments spikes with direction, frequency, and amplitude information. The method enables fully spike-driven transformers to handle both discriminative and generative tasks while maintaining addition-like computations. The authors provide theoretical support via dynamical isometry and demonstrate substantial accuracy improvements and energy savings compared with prior SNNs and some LIF baselines on GLUE and generation benchmarks. This work suggests a viable path toward energy-efficient, brain-inspired language models and motivates further scaling and weight-quantization research.

Abstract

Towards energy-efficient artificial intelligence similar to the human brain, bio-inspired spiking neural networks (SNNs) offer biological plausibility, event-driven sparsity, and binary activation. Large-scale language models have recently exhibited promising generalization capability, which makes it worthwhile to explore more general spike-driven models. However, the binary spikes in existing SNNs fail to encode adequate semantic information, which poses technical challenges for generalization. This work proposes the first fully spiking mechanism for general language tasks, including both discriminative and generative ones. Unlike previous spikes with {0,1} levels, we propose a more general spike formulation with bi-directional, elastic amplitude, and elastic frequency encoding, while still maintaining the addition nature of SNNs. Within a single time step, the spike is enhanced by direction and amplitude information; for spike frequency, a strategy is designed to control the spike firing rate. We apply this elastic bi-spiking mechanism to language modeling and name the resulting model SpikeLM. This is the first time that general language tasks have been handled with fully spike-driven models, which achieve much higher accuracy than previously possible. SpikeLM also substantially narrows the performance gap between SNNs and ANNs in language modeling. Our code is available at https://github.com/Xingrun-Xing/SpikeLM.
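
To make the abstract's idea of a bidirectional spike with an amplitude concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names (bi_spike, firing_rate, spike_matmul), the fixed threshold, and the single scalar amplitude are all assumptions chosen for illustration. It shows a spike taking one of three levels (negative, zero, positive), how a firing rate can be measured, and why a linear layer driven by such spikes still needs only additions and subtractions plus one final scaling.

```python
import numpy as np


def bi_spike(x, threshold, amplitude=1.0):
    """Hypothetical ternary spike: +amplitude, -amplitude, or 0 per element.

    An element fires when its magnitude reaches the threshold; the sign of
    the input gives the spike direction.
    """
    direction = np.sign(x) * (np.abs(x) >= threshold)  # values in {-1, 0, +1}
    return amplitude * direction


def firing_rate(spikes):
    """Fraction of elements that emitted a non-zero spike."""
    return float(np.mean(spikes != 0))


def spike_matmul(direction, amplitude, weights):
    """Compute (amplitude * direction) @ weights without per-element multiplies.

    Rows of `weights` selected by positive spikes are added, rows selected by
    negative spikes are subtracted, and the shared amplitude is applied once.
    """
    out = np.zeros((direction.shape[0], weights.shape[1]))
    for i in range(direction.shape[0]):
        added = weights[direction[i] > 0].sum(axis=0)
        subtracted = weights[direction[i] < 0].sum(axis=0)
        out[i] = added - subtracted
    return amplitude * out


# Illustrative inputs (assumed values, not taken from the paper).
x = np.array([[0.9, -0.2, -1.5, 0.1]])
spikes = bi_spike(x, threshold=0.5, amplitude=0.7)
rate = firing_rate(spikes)

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))
accumulated = spike_matmul(np.sign(spikes), 0.7, weights)
```

In this sketch the threshold is the only knob that changes the firing rate: raising it silences more elements. How SpikeLM actually sets the amplitude and controls spike frequency is described in the paper and its repository, and may differ from the fixed values assumed here.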

Paper Structure

This paper contains 22 sections, 22 equations, 8 figures, and 9 tables.

Figures (8)

  • Figure 1: Comparison between previous spike encoding (a) and our elastic bidirectional encodings (b, c, d). The bidirectional, frequency, and amplitude encodings are applied sequentially.
  • Figure 2: Spike firing rate in LIF-BERT (left) and activated rate in Binary BERT (right) in every linear layer.
  • Figure 3: Relationship between the variance of input distributions and the spike firing frequencies.
  • Figure 4: SNN scaling law of SpikeLM (T=1,4).
  • Figure 5: Spike firing rate in linear layers under two settings: the learnable thresholds $\bm{\alpha}(t)$ (a) and spike frequency encoding (b, c, d).
  • ...and 3 more figures
