Spiking Convolutional Neural Networks for Text Classification

Changze Lv, Jianhan Xu, Xiaoqing Zheng

TL;DR

After fine-tuning with surrogate gradients, the converted SNNs achieve results comparable to those of their DNN counterparts while consuming much less energy, across multiple English and Chinese datasets.

Abstract

Spiking neural networks (SNNs) offer a promising pathway to implementing deep neural networks (DNNs) in a more energy-efficient manner, since their neurons are sparsely activated and inference is event-driven. However, very few works have demonstrated the efficacy of SNNs in language tasks, partly because it is non-trivial to represent words in the form of spikes and to handle variable-length texts with SNNs. This work presents a two-step "conversion + fine-tuning" method for training SNNs for text classification and proposes a simple but effective way to encode pre-trained word embeddings as spike trains. We show empirically that after fine-tuning with surrogate gradients, the converted SNNs achieve results comparable to those of their DNN counterparts with much less energy consumption across multiple datasets for both English and Chinese. We also show that such SNNs are more robust to adversarial attacks than DNNs.
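
The abstract does not spell out how word embeddings become spike trains. One common approach is rate coding: map each embedding value to a firing probability in [0, 1], then sample a 0/1 spike at every time step. The sketch below illustrates that idea only. The function names `to_positive` and `rate_encode`, the clipping range, and the Bernoulli sampling are assumptions made for illustration, not necessarily the paper's exact procedure.

```python
import random


def to_positive(embedding, lo, hi):
    """Shift and scale values from [lo, hi] into [0, 1], clipping outliers.

    The linear rescaling is an assumed stand-in for the "positive
    equivalents" of the word embeddings.
    """
    span = hi - lo
    return [min(1.0, max(0.0, (x - lo) / span)) for x in embedding]


def rate_encode(rates, num_steps, rng):
    """Return a num_steps x len(rates) grid of 0/1 spikes.

    At every time step, dimension i fires with probability rates[i].
    """
    return [[1 if rng.random() < r else 0 for r in rates]
            for _ in range(num_steps)]


rng = random.Random(0)                  # fixed seed for reproducibility
embedding = [-0.8, 0.0, 0.4, 1.2]       # toy 4-dimensional word vector
rates = to_positive(embedding, lo=-1.0, hi=1.0)
spikes = rate_encode(rates, num_steps=200, rng=rng)

# The empirical firing rate of each dimension approximates its rate value.
firing_rates = [sum(column) / len(spikes) for column in zip(*spikes)]
```

Larger embedding values produce denser spike trains, so averaging the spikes over time approximately recovers the normalized embedding.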

Paper Structure (25 sections, 11 equations, 5 figures, 5 tables, 1 algorithm)

Figures (5)

  • Figure 1: An illustration of the two-step method (conversion + fine-tuning) for training spiking neural networks for text classification: initialize an SNN with the weights of a tailored network trained with gradient descent, then perform backpropagation with surrogate gradients on the converted SNN. The tailored network is obtained by replacing the max-pooling operation with average-pooling, the Sigmoid activation function with ReLU, and the word embeddings with their positive equivalents.
  • Figure 2: Computational steps in training SNNs by generalized backpropagation with surrogate gradients. (a) A recurrent representation of a leaky integrate-and-fire (LIF) neuron. (b) An unrolled computational graph of the LIF neuron, where time flows from left to right.
  • Figure 3: The impact of hyper-parameters. (a) Accuracy versus the number of neurons used per category. (b) Accuracy versus the decay rate $\beta$. (c) and (d) Accuracy and the proportion of active neurons for different values of the membrane threshold $U_{\text{thr}}$ on the SST-2 and ChnSenti datasets.
  • Figure 4: (a) Classification accuracy versus the number of epochs used to fine-tune SNNs. (b) and (c) Accuracy and the proportion of active neurons for different values of the membrane threshold $U_{\text{thr}}$ on the MR and Subj datasets.
  • Figure 5: Classification accuracy versus the number of time steps. (a) The accuracy achieved by the fine-tuned SNNs with various numbers of time steps at inference time on the test sets of the MR, Subj, ChnSenti, and Waimai datasets. (b) The accuracy achieved by the SNNs with and without fine-tuning on two English text classification benchmarks. (c) The accuracy achieved by the SNNs with and without fine-tuning on two Chinese text classification datasets.
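
The LIF neuron of Figure 2 and the hyper-parameters $\beta$ and $U_{\text{thr}}$ of Figure 3 can be illustrated with a minimal sketch. It simulates a single neuron over discrete time steps and defines a smooth stand-in for the derivative of the spike function, which is what makes backpropagation through the unrolled graph possible. The reset-by-subtraction rule, the fast-sigmoid surrogate, and the names `lif_forward`, `surrogate_grad`, and `slope` are assumptions chosen for illustration; the paper may use different choices.

```python
def lif_forward(inputs, beta=0.9, u_thr=1.0):
    """Simulate one LIF neuron; return (membrane potentials, spikes).

    beta is the decay rate of the membrane potential and u_thr is the
    firing threshold. Reset by subtraction is an assumed reset rule.
    """
    u, spike = 0.0, 0
    potentials, spikes = [], []
    for current in inputs:
        # Leak the old potential, integrate the input, and subtract the
        # threshold if the neuron fired at the previous step.
        u = beta * u + current - spike * u_thr
        spike = 1 if u > u_thr else 0
        potentials.append(u)
        spikes.append(spike)
    return potentials, spikes


def surrogate_grad(u, u_thr=1.0, slope=25.0):
    """Fast-sigmoid surrogate for dS/dU.

    The true derivative of the step function is zero almost everywhere,
    so the backward pass substitutes this smooth bump centered on u_thr.
    """
    return 1.0 / (1.0 + slope * abs(u - u_thr)) ** 2


# A constant input current drives the neuron to fire periodically.
potentials, spikes = lif_forward([0.75] * 6, beta=0.5, u_thr=1.0)
```

The forward pass keeps the hard threshold, so the neuron's outputs remain binary; only the backward pass uses the surrogate, which peaks where the membrane potential equals the threshold and decays away from it.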