Hardware-Accelerated Event-Graph Neural Networks for Low-Latency Time-Series Classification on SoC FPGA

Hiroshi Nakano, Krzysztof Blachut, Kamil Jeziorek, Piotr Wzorek, Manon Dampfhoffer, Thomas Mesquida, Hiroaki Nishi, Tomasz Kryjak, Thomas Dalgaty

TL;DR

This work presents a hardware-accelerated, asynchronous event-graph neural network for low-latency time-series classification on a SoC FPGA, using an artificial cochlea to convert signals into sparse events. A novel skip-step graph generator and PN-normalized Graph Convolution enable efficient hardware mapping, leading to a fully embedded SHD classifier with high accuracy and low resource usage. On a Zynq UltraScale+ platform, the approach achieves state-of-the-art FPGA performance on SHD, with per-event latency around 8 µs and total system power near 3.94 W, while using far fewer parameters than competing SNN-based solutions. The results demonstrate the viability of event-graph models for edge AI and lay groundwork for future end-to-end, continuously streamed, real-sensor deployments and alternative architectures like LSTMs.

Abstract

As the quantity of data recorded by embedded edge sensors grows, so too does the need for intelligent local processing. Such data often take the form of time-series signals, from which an AI model can make real-time predictions locally. However, this requires a hardware-software approach capable of making low-latency predictions with low power consumption. In this paper, we present a hardware implementation of an event-graph neural network for time-series classification. We leverage an artificial cochlea model to convert the input time-series signals into a sparse event-data format that allows the event-graph to drastically reduce the number of calculations relative to other AI methods. We implemented the design on a SoC FPGA and applied it to the real-time processing of the Spiking Heidelberg Digits (SHD) dataset to benchmark our approach against competitive solutions. Our method achieves a floating-point accuracy of 92.7% on the SHD dataset for the base model, which is only 2.4% and 2% less than the state-of-the-art models with over 10% and 67% fewer model parameters, respectively. It also outperforms FPGA-based spiking neural network implementations by 19.3% and 4.5%, achieving 92.3% accuracy for the quantised model while using fewer computational resources and reducing latency.

Paper Structure

This paper contains 18 sections, 4 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overview of our hardware-accelerated event-graph neural network implementation. Dotted modules indicate event-by-event operation. For testing, we read data from an SD card and simulate real intervals between events.
  • Figure 2: Spectro-Temporal Spike Rasters from the SHD dataset.
  • Figure 3: Overview of the graph generation and convolution modules. The skip step is set to 10 and $r_{ch}$ is set to 100. The number of output features is denoted as $n$.
  • Figure 4: Model complexity analysis in terms of FLOPs. The plots illustrate the significant impact on computational complexity of not only the model size but also the graph generator parameters.