Fast inference of deep neural networks in FPGAs for particle physics
Javier Duarte, Song Han, Philip Harris, Sergo Jindariani, Edward Kreinar, Benjamin Kreis, Jennifer Ngadiuba, Maurizio Pierini, Ryan Rivera, Nhan Tran, Zhenbin Wu
TL;DR
The paper demonstrates that neural networks can be implemented on FPGAs for LHC real-time triggers using a companion tool, hls4ml, to translate trained models into FPGA-implementable firmware. Through a jet substructure case study, it maps resource usage and latency across network architectures, reveals that aggressive compression and fixed-point quantization yield sub-100 ns inference on modern FPGAs while preserving performance. It shows that reduced precision and parameter pruning dramatically cut DSP usage, enabling practical deployment within L1 trigger budgets, and provides insights into the fidelity of HLS-based resource estimates versus final implementations. The work argues for broad applicability of this approach to physics and beyond, with future plans to extend to CNNs, RNNs, and other FPGA platforms.
Abstract
Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.
