EvGNN: An Event-driven Graph Neural Network Accelerator for Edge Vision

Yufeng Yang, Adrian Kneip, Charlotte Frenkel

TL;DR

EvGNN targets ultra-low-latency, energy-efficient edge vision through end-to-end hardware acceleration of event-driven GNNs. It combines directed dynamic graphs, a spatiotemporally decoupled prism neighbor search, and layer-parallel execution to perform per-event GNN updates with minimal memory and latency. The approach achieves 87.8% accuracy on N-CARS with an average per-event latency of 16 μs on a Xilinx KV260, and a scalable ASIC-like model projects sub-10 μs latency and low energy for future edge implementations. In contrast to synchronous GNN accelerators, EvGNN shows how event-driven, local updates can deliver real-time vision intelligence at the edge with a compact hardware footprint.

Abstract

Edge vision systems combining sensing and embedded processing promise low-latency, decentralized, and energy-efficient solutions that forgo reliance on the cloud. As opposed to conventional frame-based vision sensors, event-based cameras deliver a microsecond-scale temporal resolution with sparse information encoding, thereby outlining new opportunities for edge vision systems. However, mainstream algorithms for frame-based vision, which mostly rely on convolutional neural networks (CNNs), can hardly exploit the advantages of event-based vision as they are typically optimized for dense matrix-vector multiplications. While event-driven graph neural networks (GNNs) have recently emerged as a promising solution for sparse event-based vision, their irregular structure is a challenge that currently hinders the design of efficient hardware accelerators. In this paper, we propose EvGNN, the first event-driven GNN accelerator for low-footprint, ultra-low-latency, and high-accuracy edge vision with event-based cameras. It relies on three central ideas: (i) directed dynamic graphs exploiting single-hop nodes with edge-free storage, (ii) event queues for the efficient identification of local neighbors within a spatiotemporally decoupled search range, and (iii) a novel layer-parallel processing scheme allowing for a low-latency execution of multi-layer GNNs. We deployed EvGNN on a Xilinx KV260 UltraScale+ MPSoC platform and benchmarked it on the N-CARS dataset for car recognition, demonstrating a classification accuracy of 87.8% and an average latency per event of 16 μs, thereby enabling real-time, microsecond-resolution event-based vision at the edge.
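Idea (ii) of the abstract, finding local neighbors through event queues within a spatiotemporally decoupled search range, can be illustrated with a minimal Python sketch. It is not the EvGNN hardware design: the class name `EventQueues`, the per-pixel queue layout, and the parameters `depth`, `radius`, and `window` are all assumptions made for illustration. The point it shows is that the spatial bound (a square around the new event) and the temporal bound (a time window into the past) are checked independently, so the search region is a prism in (x, y, t).

```python
from collections import deque


class EventQueues:
    """Per-pixel bounded FIFOs of past event timestamps (illustrative only)."""

    def __init__(self, width, height, depth=4):
        self.width = width
        self.height = height
        # deque(maxlen=depth) silently drops the oldest timestamp when full.
        self.queues = [[deque(maxlen=depth) for _ in range(width)]
                       for _ in range(height)]

    def process_event(self, x, y, t, radius, window):
        """Return the past neighbors of a new event, then store the event."""
        neighbors = []
        # Spatial bound: a square of side 2*radius+1, clipped to the sensor.
        for yn in range(max(0, y - radius), min(self.height, y + radius + 1)):
            for xn in range(max(0, x - radius), min(self.width, x + radius + 1)):
                # Temporal bound, checked independently of the spatial one.
                for tn in self.queues[yn][xn]:
                    if 0 <= t - tn <= window:
                        neighbors.append((xn, yn, tn))
        # The new event is stored only after the search, so it never
        # becomes its own neighbor and edges always point from past to new.
        self.queues[y][x].append(t)
        return neighbors
```

In this sketch every returned tuple stands for one directed edge from a past event to the new one. No edge list is kept, since the neighbors are recomputed from the queues whenever an event arrives.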

Paper Structure (31 sections, 8 equations, 14 figures, 6 tables)

Figures (14)

  • Figure 1: Illustration of a typical event GNN pipeline. The colors of nodes represent their features. After the graph convolution, the features are updated (color changed). The graph pooling layer, where certain nodes are selected and edges are re-arranged, simplifies the graph structure through node down-sampling. The graph readout layer divides the graph into several grids, whose cells are assigned node features. Finally, after flattening, the cell features are processed by the FC layer to derive the graph-level prediction results.
  • Figure 2: Illustration of graph convolution, consisting of three steps: message generation, aggregation, and feature update. In the graph, each node $i$ has a feature vector. First, every node generates its message (colored numbers) by the differentiable function $\phi()$ (message generation, here illustrated with a simple replication operation). Next, messages are exchanged through the edges and nodes aggregate the messages they receive (aggregation, illustrated with a max-value operation). Note that the message from $A$ cannot be aggregated by $D$ due to the directed edge. Finally, the aggregated features are transformed by $\gamma()$, generating the new layer's feature vectors (feature update, illustrated with a replication operation).
  • Figure 3: Static and dynamic event graphs generated from the same event stream. (a) The whole event stream is first transformed into a static event graph, which a GNN then processes to provide a prediction result. (b) Whenever a new event is generated (shown in blue), the dynamic event graph is updated and then processed by a GNN, thereby generating an updated prediction result on a low-latency, per-event basis.
  • Figure 4: Event-driven K-layer GNN processing schemes. (a) Naive scheme -- A dynamic event graph, with the blue dot representing the new event node, is processed entirely for each layer of a conventional GNN. The colored nodes represent the updated features along the GNN, while uncolored ones depict unchanged features. (b) AEGNN scheme -- The same event graph is processed within a $k$-hop subgraph by the $k^\text{th}$ layer of the event-driven GNN, only involving nodes whose features will be effectively updated. (c) HUGNet scheme -- A directed dynamic event graph implies a causal flow of information, hence only the 1-hop subgraph is needed as the input of each layer, because the features of neighboring nodes remain untouched.
  • Figure 5: Proposed event-driven GNN processing pipeline. (a) A new input event (blue) in an event stream, and its neighboring past events, are processed together in an event-driven fashion. (b) The new event searches potential neighbors in the blue causal prism region, and connects with them by directed edges, thus constructing the event graph (Section \ref{sect:graph_build_codesign}). (c) The updated event graph is processed by a multilayer GNN (Section \ref{sect:network_opt}), where only (i) the 1-hop subgraph containing the new event is processed, and (ii) the features of the new node are updated. (d) The GNN updates the graph-level prediction.
  • ...and 9 more figures
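
The three graph-convolution steps in the Figure 2 caption (message generation, aggregation, feature update), restricted to the new node and its 1-hop in-neighbors as in the Figure 4(c) and Figure 5(c) captions, can be sketched in plain Python as follows. This is a hedged illustration and not the accelerator's datapath: the function name `update_new_node` is hypothetical, `phi` and `gamma` stand for the caption's $\phi()$ and $\gamma()$, and adding a self-message so that the aggregation is never empty is an assumption of this sketch.

```python
def update_new_node(own_feature, neighbor_features, phi, gamma):
    """Compute the new node's next-layer feature from its 1-hop in-neighbors."""
    # Message generation: one message per incoming directed edge.
    messages = [phi(f) for f in neighbor_features]
    # Assumption of this sketch: the node also sends a message to itself,
    # so the aggregation below is defined even without neighbors.
    messages.append(phi(own_feature))
    # Aggregation: element-wise maximum over all received messages.
    aggregated = [max(values) for values in zip(*messages)]
    # Feature update: transform the aggregate into the next-layer feature.
    return gamma(aggregated)


def replicate(vector):
    """Stand-in for the replication used as phi and gamma in Figure 2."""
    return list(vector)


# New node with feature [1, 5] and two past neighbors.
new_feature = update_new_node([1, 5], [[3, 2], [0, 7]], replicate, replicate)
```

Only the new node's feature is computed here. Because edges are directed from past events to the new one, the features of the neighbors are left untouched, which is why each layer needs nothing beyond the 1-hop subgraph.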