Language Modeling on a SpiNNaker 2 Neuromorphic Chip

Khaleelulla Khan Nazeer, Mark Schöne, Rishav Mukherji, Bernhard Vogginger, Christian Mayr, David Kappel, Anand Subramoney

TL;DR

This work tackles the energy demands of large language models by demonstrating a language model implemented on the SpiNNaker 2 neuromorphic chip using the event-based EGRU unit. The model uses three sparsely connected EGRU layers with 95% of the weights pruned and the remainder stored in compressed sparse row (CSR) format. It is evaluated on WikiText-2 and on a DVS gesture recognition task, showing energy-efficient, event-driven processing on hardware designed for sparse communication. The results indicate sub-watt power consumption during inference and competitive performance relative to LSTMs, illustrating the feasibility of neuromorphic language processing and gesture recognition and outlining practical bottlenecks and future directions. The study suggests that future gains from quantization and multi-chip scaling could bring neuromorphic NLP closer to edge-ready deployment for larger models and real-time applications.

Abstract

As large language models continue to scale rapidly in size, so too does the computational power required to run them. Event-based networks on neuromorphic devices offer a potential way to significantly reduce the energy consumption of inference. However, to date, most event-based networks that can run on neuromorphic hardware, including spiking neural networks (SNNs), have not achieved task performance even on par with LSTM models for language modeling. As a result, language modeling on neuromorphic devices has seemed a distant prospect. In this work, we demonstrate the first-ever implementation of a language model on a neuromorphic device, specifically the SpiNNaker 2 chip, based on a recently published event-based architecture called the EGRU. SpiNNaker 2 is a many-core neuromorphic chip designed for large-scale asynchronous processing, and the EGRU is architected to leverage such hardware efficiently while maintaining competitive task performance. This implementation marks the first time a neuromorphic language model matches LSTMs, setting the stage for taking task performance to the level of large language models. We also demonstrate results on a gesture recognition task based on inputs from a DVS camera. Overall, our results showcase the feasibility of this neuro-inspired neural network in hardware, highlighting significant gains in energy efficiency over conventional hardware for the common use case of single-batch inference.

Paper Structure

This paper contains 16 sections, 3 equations, 2 figures, 3 tables, 1 algorithm.

Figures (2)

  • Figure 1: EGRU operations and distribution strategy. The figure shows the computation performed on a single processing element (PE) as part of a multi-core implementation; the grayed-out portions are computed on other PEs.
  • Figure 2: Profiling of the internal EGRU operations for language modeling. The EGRU model has 1350 units and 3 layers, with 95% of its weights pruned. Weights are stored in sparse CSR format.
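
To illustrate why CSR storage pairs well with event-based processing, the following is a minimal sketch in plain Python, not the implementation that runs on SpiNNaker 2. The function names `dense_to_csr` and `event_driven_accumulate` are hypothetical, and the layout (row i holds the outgoing weights of source unit i) is an assumption made for the example. The point is that work scales with the number of events times the number of surviving weights per unit, rather than with the full matrix size.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) into CSR arrays.

    Returns (values, col_idx, row_ptr), where the non-zeros of row i are
    values[row_ptr[i]:row_ptr[i + 1]] at columns col_idx[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, w in enumerate(row):
            if w != 0.0:
                values.append(w)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def event_driven_accumulate(csr, events, n_out):
    """Accumulate input to n_out target units from active source units only.

    Assumption for this sketch: row i of the CSR matrix holds the outgoing
    weights of source unit i. `events` is a list of (source_index, value)
    pairs; silent units are simply absent and cost nothing.
    """
    values, col_idx, row_ptr = csr
    out = [0.0] * n_out
    for src, val in events:
        # Visit only the non-pruned weights of the unit that emitted an event.
        for k in range(row_ptr[src], row_ptr[src + 1]):
            out[col_idx[k]] += val * values[k]
    return out


if __name__ == "__main__":
    # Toy 3x3 weight matrix with most entries pruned to zero.
    weights = [
        [0.0, 2.0, 0.0],
        [0.0, 0.0, 0.0],
        [1.5, 0.0, -1.0],
    ]
    csr = dense_to_csr(weights)
    # Units 0 and 2 emit events; unit 1 stays silent.
    print(event_driven_accumulate(csr, [(0, 1.0), (2, 2.0)], n_out=3))
    # prints [3.0, 2.0, -2.0]
```

In this toy case only three multiply-accumulate operations are performed instead of nine, because both the pruned weights and the silent unit are skipped.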