BitHEP -- The Limits of Low-Precision ML in HEP

Claudius Krause, Daohan Wang, Ramon Winterhalder

TL;DR

This paper evaluates BitNet, a low-precision, quantization-aware neural-network architecture, across three core HEP tasks: quark-gluon tagging, SMEFT parameter estimation, and detector simulation. Using BitLinear layers with binary/ternary weights and 8-bit inputs, it shows competitive performance in quark-gluon classification while revealing nuanced degradation in regression and generative tasks that depends on network size and which layers are quantized. The results highlight the importance of selective quantization, showing that attention-based architectures and larger networks tend to be more resilient, and demonstrate that quantization-aware training can align efficiency gains with accuracy for HL-LHC-scale workloads. The findings motivate further work on heterogeneous, fully quantized pipelines and hardware-specific kernels to enable real-time, energy-efficient ML in high-energy physics.
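To make "ternary weights and 8-bit inputs" concrete, the forward pass of a BitLinear-style layer can be sketched as below. This is a minimal illustration, not the authors' implementation: the scaling choices (mean absolute value for weights, per-row maximum absolute value for inputs) are assumptions taken from the general BitNet literature, and the function names are hypothetical. Normalization and the straight-through gradient trick used during quantization-aware training are omitted.

```python
import numpy as np

def quantize_weights_ternary(w, eps=1e-8):
    """Map weights to {-1, 0, +1}.

    Assumption: scale by the mean absolute weight, then round and clip.
    """
    scale = np.mean(np.abs(w)) + eps
    w_q = np.clip(np.round(w / scale), -1, 1)
    return w_q, scale

def quantize_inputs_int8(x, eps=1e-8):
    """Map each input row to integers in [-127, 127].

    Assumption: per-row scaling by the maximum absolute value.
    """
    absmax = np.max(np.abs(x), axis=-1, keepdims=True) + eps
    x_q = np.clip(np.round(x * 127.0 / absmax), -127, 127)
    return x_q, absmax / 127.0

def bitlinear_forward(x, w):
    """Hypothetical BitLinear-style forward pass.

    x has shape (batch, n_in), w has shape (n_out, n_in).
    The matrix product uses only quantized values; the two scales
    are applied afterwards to return to the original range.
    """
    w_q, w_scale = quantize_weights_ternary(w)
    x_q, x_scale = quantize_inputs_int8(x)
    y_int = x_q @ w_q.T
    return y_int * x_scale * w_scale

# Toy comparison against a full-precision linear map.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))
w = rng.normal(size=(8, 16))
y_quant = bitlinear_forward(x, w)
y_full = x @ w.T
```

Comparing `y_quant` with `y_full` on toy data gives a feel for the rounding error that a quantized layer introduces, which is the effect the regression and generative studies probe at larger scale.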

Abstract

The increasing complexity of modern neural network architectures demands fast and memory-efficient implementations to mitigate computational bottlenecks. In this work, we evaluate the recently proposed BitNet architecture in HEP applications, assessing its performance in classification, regression, and generative modeling tasks. Specifically, we investigate its suitability for quark-gluon discrimination, SMEFT parameter estimation, and detector simulation, comparing its efficiency and accuracy to state-of-the-art methods. Our results show that while BitNet consistently performs competitively in classification tasks, its performance in regression and generation varies with the size and type of the network, highlighting key limitations and potential areas for improvement.

Paper Structure

This paper contains 14 sections, 7 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Illustration of the BitLinear layer.
  • Figure 2: Calibration curves of P-DAT and P-DAT-Bit models for quark/gluon discrimination. The error band has been obtained by evaluating 5 independent runs.
  • Figure 3: Two-dimensional scatter plots of the true and predicted angle $\phi_{\text{decay}}$ in the test dataset. Left: SMEFTNet with regular linear layers. Right: SMEFTNet-Bit with BitLinear layers.
  • Figure 4: Histograms of residuals, defined as the differences between true and predicted values, for the $\phi_{\text{decay}}$ regression task. The blue histogram represents SMEFTNet without BitLinear layers, whereas the red histogram corresponds to SMEFTNet with BitLinear layers. From left to right: 100%, 70%, and 30% of the weights are quantized.
  • Figure 5: Illustration of different levels of quantization for the flow. Blue lines represent regular linear layers, while green lines indicate BitLinear layers. The three setups, from top to bottom, are: regular, NNCentral, and BlockCentral.