Fast inference of deep neural networks in FPGAs for particle physics

Javier Duarte, Song Han, Philip Harris, Sergo Jindariani, Edward Kreinar, Benjamin Kreis, Jennifer Ngadiuba, Maurizio Pierini, Ryan Rivera, Nhan Tran, Zhenbin Wu

TL;DR

The paper demonstrates that neural networks can be implemented on FPGAs for LHC real-time triggers using a companion tool, hls4ml, which translates trained models into FPGA firmware. Through a jet substructure case study, it maps resource usage and latency across network architectures and finds that aggressive compression and fixed-point quantization yield inference latencies on the order of 100 ns on modern FPGAs while preserving classification performance. Reduced precision and parameter pruning substantially cut DSP usage, making deployment within L1 trigger budgets practical, and the study also examines how closely HLS-based resource estimates match the final implementations. The work argues that the approach applies broadly in physics and beyond, with plans to extend it to CNNs, RNNs, and other FPGA platforms.
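The two compression ideas named above, parameter pruning and fixed-point quantization, can be illustrated with a minimal Python sketch. This is not hls4ml code and not necessarily the paper's exact procedure. The function names `prune_by_magnitude` and `quantize_fixed_point`, the magnitude-based pruning criterion, the round-half-up and saturation behavior, and the example bit widths are all assumptions chosen for illustration. The `total_bits`/`integer_bits` split follows the common signed fixed-point convention in which the integer bits include the sign bit.

```python
import math


def prune_by_magnitude(weights, fraction):
    """Return a copy of weights with the smallest-|w| fraction set to zero.

    Assumption: pruning by absolute magnitude; the summary does not say
    which criterion the paper uses.
    """
    n_prune = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed_point(x, total_bits=16, integer_bits=6):
    """Map x onto a signed fixed-point grid.

    total_bits counts all bits; integer_bits counts the bits above the
    binary point, sign bit included. Assumption: round half up and
    saturate at the range limits (hardware types may truncate or wrap).
    """
    frac_bits = total_bits - integer_bits
    scale = 1 << frac_bits                  # number of steps per unit
    lo = -(1 << (total_bits - 1))           # most negative integer code
    hi = (1 << (total_bits - 1)) - 1        # most positive integer code
    code = math.floor(x * scale + 0.5)      # round half up
    code = max(lo, min(hi, code))           # saturate instead of wrapping
    return code / scale


# Hypothetical weights, not taken from the paper.
weights = [0.75, -0.02, 0.3, 0.004, -1.5, 0.11]
pruned = prune_by_magnitude(weights, 0.5)
quantized = [quantize_fixed_point(w, total_bits=8, integer_bits=3) for w in pruned]

# Zero weights need no multiplier, so the nonzero count is a rough proxy
# for the multiplications that remain after pruning.
nonzero = sum(1 for w in quantized if w != 0.0)
```

In this toy case half of the weights are zeroed and the survivors are stored in 8 bits each. Fewer and narrower multiplications is the general reason such compression can reduce DSP usage, although the actual savings depend on the FPGA and the synthesis tool.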

Abstract

Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.

Paper Structure

This paper contains 11 sections, 4 equations, 17 figures, 2 tables.

Figures (17)

  • Figure 1: A typical workflow to translate a model into an FPGA implementation using hls4ml.
  • Figure 2: A cartoon of a deep, fully connected neural network illustrating the description conventions used in the text.
  • Figure 3: Example Feynman diagrams of interesting physics signatures that would benefit from jet substructure algorithms in the hardware trigger.
  • Figure 4: Two neural network architectures for jet substructure classification. (Left) A three-hidden-layer model we use to categorize five classes of jets ($q$, $g$, $W$, $Z$, and $t$). (Right) A one-hidden-layer model used to identify top quarks, simplified for the FPGA implementation described in Sec. \ref{sec:implementation}.
  • Figure 5: Performance of the deep neural network classifier: (Left) signal efficiency versus mis-identification rate for quark, gluon, $W$ boson, $Z$ boson, and top quark jet identification. The mis-identification rate is based on an equal admixture of the other non-signal jet types. (Right) The corresponding normalized confusion matrix for the five classes.
  • ...and 12 more figures