Investigating Resource-efficient Neutron/Gamma Classification ML Models Targeting eFPGAs

Jyothisraj Johnson, Billy Boxer, Tarun Prakash, Carl Grace, Peter Sorensen, Mani Tripathi

TL;DR

The paper addresses real-time neutron/gamma discrimination in radiation detection by evaluating resource-efficient ML models on open-source eFPGA fabrics. It applies a hardware-aware co-design workflow to two model families, boosted decision trees (BDTs) and fully-connected neural networks (fcNNs), mapping each to eFPGA fabrics while accounting for ADC quantization and front-end constraints. Experiments on a Stilbene-SiPM dataset show that BDTs can achieve a neutron efficiency above $95\%$ at a gamma leakage of $10^{-3}$ with modest resource usage, and that fcNNs can achieve similar performance at a higher resource cost. BDTs require no DSP/BRAM resources and have lower latencies, making them attractive for initial test chips, whereas fcNNs demand more DSP/BRAM but can still meet performance targets with careful quantization and pruning. The work informs the specification of an eFPGA fabric for a test chip and demonstrates the value of hardware-aware model design in resource-limited environments, with practical implications for integrating ML into detector readout and real-time decision systems.

Abstract

Over the last several years, the particle and nuclear physics communities have shown considerable interest, and made corresponding progress, in implementing machine learning (ML) models in hardware. A major driver has been the release of the Python package hls4ml, which has enabled porting models specified and trained using Python ML libraries to register-transfer level (RTL) code. So far, the primary end targets have been commercial FPGAs or synthesized custom blocks on ASICs. However, recent developments in open-source embedded FPGA (eFPGA) frameworks now provide an alternate, more flexible pathway for implementing ML models in hardware. These customized eFPGA fabrics can be integrated as part of an overall chip design. In general, the decision between a fully custom, eFPGA, or commercial FPGA ML implementation will depend on the details of the end-use application. In this work, we explored the parameter space for eFPGA implementations of fully-connected neural network (fcNN) and boosted decision tree (BDT) models using the task of neutron/gamma classification, with a specific focus on resource efficiency. To generate training and test data for this study, we used data collected with an AmBe sealed source incident on Stilbene that was optically coupled to an OnSemi J-series SiPM. We investigated relevant input features and the effects of bit resolution and sampling rate, as well as trade-offs in hyperparameters for both ML architectures, while tracking total resource usage. The performance metric was the calculated neutron efficiency at a gamma leakage of $10^{-3}$. The results of the study will be used to aid the specification of an eFPGA fabric, which will be integrated as part of a test chip.
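
The performance metric named in the abstract can be made concrete with a short sketch. This is not the authors' code: the function name, the convention that a higher classifier score means "more neutron-like", and the strict-threshold handling of ties are all assumptions made for illustration.

```python
import math


def neutron_efficiency_at_leakage(neutron_scores, gamma_scores, leakage=1e-3):
    """Return (efficiency, threshold) at the loosest cut meeting the leakage target.

    Assumes a higher score means "more neutron-like" (an illustrative convention).
    """
    if not neutron_scores or not gamma_scores:
        raise ValueError("both score lists must be non-empty")

    gammas = sorted(gamma_scores, reverse=True)
    # Largest number of gammas allowed to pass the cut; the small epsilon
    # guards against floating-point round-off in leakage * N.
    allowed = min(len(gammas), math.floor(leakage * len(gammas) + 1e-9))

    # Events pass only if they score strictly above the threshold, so at most
    # `allowed` gammas can leak even when scores are tied.
    threshold = -math.inf if allowed == len(gammas) else gammas[allowed]

    passed = sum(1 for s in neutron_scores if s > threshold)
    return passed / len(neutron_scores), threshold
```

With N labelled gamma events, the smallest nonzero leakage that can be resolved is 1/N, so a target of $10^{-3}$ needs at least 1000 gammas in the test set, and considerably more for a statistically stable estimate.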

Paper Structure (9 sections, 2 equations, 16 figures)

Figures (16)

  • Figure 1: A system diagram illustrating the general signal progression in two cases: (top) analog circuits calculate the input feature values and ADCs digitize those values; (bottom) a waveform digitizer and a digital signal processing block are used.
  • Figure 2: The relationship between ADC bit resolution and the number of bits required in fixed-point format to represent values in each of two base units over a 2 V full-scale range. The top panel shows the mapping for one base unit and the bottom panel for the other.
  • Figure 3: Diagram of the benchtop test bed used to acquire the waveform dataset used in this work. Figure is replicated from Boxer_2023.
  • Figure 4: The neutron and gamma events used for this study, plotted as PSD ratio vs. total deposition energy. A partial integration window of 128 ns and a total integration window of 1500 ns are used. An energy cut at 90 keVee is included. Afterwards, upper and lower threshold cuts were defined to exclude events in between the neutron and gamma bands to maintain purity of the truth data set.
  • Figure 5: The frequency response of the 2nd-order Butterworth filters used to re-filter the waveforms in software. Corner frequencies were specified corresponding to a 1$\times$, 2$\times$, 4$\times$, 8$\times$, and 16$\times$ downscaling of the actual 125 MHz anti-aliasing filter used by the CAEN digitizer.
  • ...and 11 more figures
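
Two of the captions above (Figures 4 and 5) describe computations that are simple to sketch. The block below is illustrative only and rests on several assumptions not stated in the captions: the PSD ratio is taken as the partial integral divided by the total integral, the waveform is assumed to be baseline-subtracted and positive-going, the sample period and pulse start index are hypothetical parameters, and the filter response uses the textbook analog Butterworth magnitude formula rather than whatever digital filter design was applied in software. All function names are made up for this example.

```python
import math

ANTI_ALIAS_MHZ = 125.0              # anti-aliasing corner quoted in the Figure 5 caption
DOWNSCALE_FACTORS = (1, 2, 4, 8, 16)


def psd_ratio(waveform, sample_period_ns, partial_ns=128.0, total_ns=1500.0,
              start_index=0):
    """Partial-over-total charge ratio (assumed definition of the PSD ratio)."""
    n_partial = int(round(partial_ns / sample_period_ns))
    n_total = int(round(total_ns / sample_period_ns))
    window = waveform[start_index:start_index + n_total]
    if len(window) < n_total:
        raise ValueError("waveform is shorter than the total integration window")
    total = sum(window)
    if total == 0:
        raise ValueError("total integral is zero; ratio is undefined")
    return sum(window[:n_partial]) / total


def corner_frequencies_mhz(base_mhz=ANTI_ALIAS_MHZ, factors=DOWNSCALE_FACTORS):
    """Corner frequencies obtained by dividing the base corner by each factor."""
    return [base_mhz / k for k in factors]


def butterworth_gain_db(freq_mhz, corner_mhz, order=2):
    """Gain in dB of an ideal analog Butterworth low-pass filter of given order."""
    magnitude = 1.0 / math.sqrt(1.0 + (freq_mhz / corner_mhz) ** (2 * order))
    return 20.0 * math.log10(magnitude)
```

For example, `corner_frequencies_mhz()` gives 125, 62.5, 31.25, 15.625, and 7.8125 MHz, and every Butterworth filter is about 3 dB down at its own corner frequency regardless of order.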