Neural Architecture Codesign for Fast Physics Applications

Jason Weitz; Dmitri Demler; Luke McDermott; Nhan Tran; Javier Duarte

TL;DR

This work introduces Neural Architecture Codesign (NAC), a two-stage, hardware-aware framework that extends neural architecture search to optimize for both accuracy and FPGA-friendly efficiency in physics tasks. A global search explores candidate architectures under hardware constraints, and a local search then applies training optimization and aggressive compression to the most promising candidates; the resulting models are synthesized into FPGA implementations via hls4ml. The authors demonstrate the approach on two case studies, Bragg peak finding in materials science and jet tagging in high-energy physics, and report significant reductions in bit operations and resource usage with preserved or improved accuracy, along with very low-latency FPGA deployments. The framework emphasizes modular search spaces, Pareto-based optimization, and open-source tooling so that it can be applied across domains by users with limited ML expertise.

Abstract

We develop a pipeline to streamline neural architecture codesign for physics applications, reducing the need for ML expertise when designing models for novel tasks. Our method employs neural architecture search and network compression in a two-stage approach to discover hardware-efficient models. The approach consists of a global search stage that explores a wide range of architectures while considering hardware constraints, followed by a local search stage that fine-tunes and compresses the most promising candidates. We exceed baseline performance on various tasks and show further speedup through model compression techniques such as quantization-aware training and neural network pruning. We synthesize the optimal models to high-level synthesis code for FPGA deployment with the hls4ml library. Additionally, our hierarchical search space provides greater flexibility in optimization and can easily extend to other tasks and domains. We demonstrate this with two case studies: Bragg peak finding in materials science and jet classification in high-energy physics, achieving models with improved accuracy, smaller latencies, or reduced resource utilization relative to the baseline models.
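
The selection step between the two stages can be illustrated with a small sketch. It is not the authors' implementation: it only shows, in plain Python, how a Pareto-optimal subset might be picked from global-search trials when two quantities are minimized together. The objective names (`error` for a task metric, `cost` for a hardware proxy such as bit operations), the function `pareto_front`, and the trial values are all hypothetical and invented for illustration.

```python
from typing import List, Tuple

# (name, error, cost); lower is better for both objectives.
# "error" and "cost" are hypothetical stand-ins for a task metric
# and a hardware proxy such as bit operations.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    front = []
    for name, err, cost in candidates:
        # A candidate is dominated if another one is at least as good on
        # both objectives and strictly better on at least one of them.
        dominated = any(
            (e <= err and c <= cost) and (e < err or c < cost)
            for _, e, c in candidates
        )
        if not dominated:
            front.append((name, err, cost))
    # Sort by cost so the front reads from cheapest to most expensive.
    return sorted(front, key=lambda t: t[2])


# Invented trial results, not taken from the paper.
trials = [
    ("a", 0.30, 1.0),
    ("b", 0.25, 2.0),
    ("c", 0.28, 3.0),  # dominated by "b"
    ("d", 0.20, 5.0),
    ("e", 0.30, 1.5),  # dominated by "a"
]
front = pareto_front(trials)
print([name for name, _, _ in front])
```

In this toy example the surviving candidates are the ones that would be handed to a local search for further tuning and compression; a real search would use many more trials and measured objectives.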

Paper Structure (23 sections, 5 equations, 6 figures, 8 tables)

Introduction
Related Work
Neural Architecture Search
Model Compression
Hardware Implementation and Synthesis
Method
Global Search
Search space
Search and Evaluation
Local Search
Training Optimization
Model compression
Model FPGA Synthesis
Bragg Peak Case Study
Method Adaptations
...and 8 more sections

Figures (6)

Figure 1: Full pipeline methodology of neural architecture codesign, containing two stages: global and local search. The global search stage explores a wide range of architectures, and the local search stage further fine-tunes hyperparameters and applies compression techniques. We then synthesize the optimal models to high-level synthesis code for FPGA deployment.
Figure 2: Bragg peak NAC visualization. The input consists of 11$\times$11 pixel patches centered on Bragg peaks from X-ray diffraction patterns. The neural network predicts the $x$ and $y$ coordinates of the peak center within each patch.
Figure 3: Our automated pipeline for neural architecture search for the Bragg peak dataset. Yellow components are human inputs, white are outputs, and orange are search processes. The right side demonstrates the template of each candidate architecture in our search space. Each subcomponent of the blocks also contains the hyperparameters to optimize.
Figure 4: Bragg peak dataset Pareto-optimal front for global (left) and local search (right). The global search contains 1000 trials, with the selected models shown in orange. For the local search, each chosen model is quantized to various bit precisions and then pruned by 20% for 20 iterations, indicated by increasing opacity. Only the first 10 iterations are displayed for visualization purposes.
Figure 5: Particle jet NAC visualization. The input consists of sets of up to 8 constituent particles from a jet, with the transverse momentum ($p_\mathrm{T}$), pseudorapidity ($\eta$), and azimuthal angle ($\phi$) known for each particle. The neural network predicts the origin of the jet (light quark, gluon, W boson, Z boson, or top quark).
...and 1 more figure
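
The pruning schedule mentioned in the Figure 4 caption (20% for 20 iterations) can be made concrete with a short arithmetic sketch. It assumes, which the caption does not state explicitly, that each iteration removes 20% of the weights that remain at that point, so the surviving fraction compounds geometrically. The helper name `remaining_fraction` is hypothetical.

```python
def remaining_fraction(rate: float, iterations: int) -> float:
    """Fraction of weights left after repeatedly pruning `rate` of the remainder.

    Assumption: each iteration prunes a fixed fraction of the weights that
    are still present, not of the original weight count.
    """
    return (1.0 - rate) ** iterations


# Surviving fraction after 0, 1, ..., 20 iterations at a 20% pruning rate.
schedule = [remaining_fraction(0.20, k) for k in range(21)]

for k in (1, 10, 20):
    print(f"after {k:2d} iterations: {schedule[k]:.4f} of weights remain")
```

Under this assumption, roughly 10.7% of the weights remain after 10 iterations and about 1.2% after 20, which gives a sense of how sparse the later points on the local-search front would be.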
