JEDI-linear: Fast and Efficient Graph Neural Networks for Jet Tagging on FPGAs
Zhiqiang Que, Chang Sun, Sudarshan Paramesvaran, Emyr Clement, Katerina Karakoulaki, Christopher Brown, Lauri Laatu, Arianna Cox, Alexander Tapper, Wayne Luk, Maria Spiropulu
TL;DR
The paper addresses the real-time jet tagging challenge in the CMS HL-LHC Level-1 trigger, where traditional GNNs face prohibitive edge-computation costs. It introduces JEDI-linear, a linear-complexity GNN variant that replaces explicit pairwise interactions with a global information gathering mechanism based on an affine edge function, enabling $\mathcal{O}(N_O)$ scaling. The authors couple this architecture with fine-grained quantization-aware training and multiplier-free distributed arithmetic to realize DSP-free, on-chip FPGA implementations, validated through automated end-to-end hardware generation. Empirical results show sub-60 ns latency, zero DSP usage, and improved accuracy compared to prior GNNs, making real-time deployment feasible and scalable; the work also provides open-source templates for broader adoption. Collectively, this work demonstrates that careful algorithm-hardware co-design can unlock powerful GNN inference in stringent real-time scientific settings and offers transferable design templates for other domains.
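The linear-scaling claim rests on a simple algebraic fact: when the edge function is affine in the two endpoint features, the sum over all neighbours can be rewritten in terms of one global sum. The sketch below illustrates only that identity in plain Python. The names `A`, `B`, `c`, `pairwise_aggregate`, and `linear_aggregate` are hypothetical, and the exact edge function, the exclusion of self-pairs, and the sum aggregation are assumptions made for illustration rather than details taken from the paper.

```python
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(k, v):
    return [k * a for a in v]

def pairwise_aggregate(X, A, B, c):
    """O(N^2): evaluate the affine edge function A x_i + B x_j + c
    on every ordered pair (i, j) with j != i and sum over j."""
    out = []
    for i, xi in enumerate(X):
        acc = [0.0] * len(c)
        for j, xj in enumerate(X):
            if i == j:
                continue
            edge = add(add(matvec(A, xi), matvec(B, xj)), c)
            acc = add(acc, edge)
        out.append(acc)
    return out

def linear_aggregate(X, A, B, c):
    """O(N): the same result from a single global sum over all particles.
    sum_{j != i} (A x_i + B x_j + c) = (N - 1)(A x_i + c) + B (S - x_i),
    where S is the sum of all x_j."""
    n = len(X)
    total = [sum(col) for col in zip(*X)]  # S, computed once
    out = []
    for xi in X:
        own = scale(n - 1, add(matvec(A, xi), c))
        others = matvec(B, [t - x for t, x in zip(total, xi)])
        out.append(add(own, others))
    return out

if __name__ == "__main__":
    random.seed(0)
    n_particles, n_feat, n_hidden = 8, 3, 4
    X = [[random.uniform(-1, 1) for _ in range(n_feat)] for _ in range(n_particles)]
    A = [[random.uniform(-1, 1) for _ in range(n_feat)] for _ in range(n_hidden)]
    B = [[random.uniform(-1, 1) for _ in range(n_feat)] for _ in range(n_hidden)]
    c = [random.uniform(-1, 1) for _ in range(n_hidden)]
    slow = pairwise_aggregate(X, A, B, c)
    fast = linear_aggregate(X, A, B, c)
    print(max(abs(s - f) for rs, rf in zip(slow, fast) for s, f in zip(rs, rf)))
```

The two functions agree up to floating-point rounding, but the second touches each particle a constant number of times, which is where the linear cost in the number of particles comes from.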
Abstract
Graph Neural Networks (GNNs), particularly Interaction Networks (INs), have shown exceptional performance for jet tagging at the CERN High-Luminosity Large Hadron Collider (HL-LHC). However, their computational complexity and irregular memory access patterns pose significant challenges for deployment on FPGAs in hardware trigger systems, where strict latency and resource constraints apply. In this work, we propose JEDI-linear, a novel GNN architecture with linear computational complexity that eliminates explicit pairwise interactions by leveraging shared transformations and global aggregation. To further enhance hardware efficiency, we introduce fine-grained quantization-aware training with per-parameter bitwidth optimization and employ multiplier-free multiply-accumulate operations via distributed arithmetic. Evaluation results show that our FPGA-based JEDI-linear achieves 3.7 to 11.5 times lower latency, up to 150 times lower initiation interval, and up to 6.2 times lower LUT usage compared to state-of-the-art GNN designs, while also delivering higher model accuracy and eliminating the need for DSP blocks entirely. This is the first interaction-based GNN to achieve less than 60 ns latency, and it currently meets the requirements for use in the HL-LHC CMS Level-1 trigger system. This work advances next-generation trigger systems by enabling accurate, scalable, and resource-efficient GNN inference in real-time environments. Our open-sourced templates will further support reproducibility and broader adoption across scientific applications.
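To make "multiplier-free multiply-accumulate" concrete, the following is a minimal sketch of the textbook lookup-table form of distributed arithmetic, in which a dot product with constant weights is computed using only table lookups, shifts, and additions. It is not claimed to be the variant used by the authors; the function names `build_lut` and `da_dot` are hypothetical, and unsigned fixed-width integer inputs are assumed for simplicity.

```python
def build_lut(weights):
    """Precompute the sum of weights for every subset of inputs,
    indexed by a bit mask (bit i set means weight i is included)."""
    n = len(weights)
    lut = [0] * (1 << n)
    for mask in range(1, 1 << n):
        low = mask & -mask               # lowest set bit of the mask
        idx = low.bit_length() - 1       # index of the weight it selects
        lut[mask] = lut[mask ^ low] + weights[idx]
    return lut

def da_dot(weights, xs, bits):
    """Dot product of constant integer weights with unsigned `bits`-bit
    inputs, processed one bit plane at a time without any multiplication."""
    lut = build_lut(weights)
    acc = 0
    for k in range(bits):
        # Gather bit k of every input into a single table address.
        mask = 0
        for i, x in enumerate(xs):
            mask |= ((x >> k) & 1) << i
        # Weight the partial sum by 2^k with a shift, then accumulate.
        acc += lut[mask] << k
    return acc

if __name__ == "__main__":
    w = [3, -5, 7, 2]
    x = [9, 4, 15, 1]
    print(da_dot(w, x, bits=4))  # prints 114
```

Because the weights are fixed after training, the table contents are constants, which is what allows this style of computation to map onto FPGA lookup tables and adders instead of DSP multipliers.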
