Physics at the Edge: Benchmarking Quantisation Techniques and the Edge TPU for Neutrino Interaction Recognition

Stefano Vergani, Hilary Utaegbulam, Michael Wang, Leigh H. Whitehead, Arden Tsang, Lorenzo Uboldi

Abstract

This work presents a comprehensive benchmark of quantisation techniques for convolutional neural networks applied to neutrino interaction recognition. Using simulation of a generic liquid argon time-projection chamber, models are quantised and then deployed on the Google Coral Edge TPU. Four Keras models are tested, and accuracy is measured across two pipelines: post-training integer quantisation and quantisation-aware training. Inference speed is benchmarked against an AMD EPYC 7763 CPU and an NVIDIA A100 GPU, and a study of energy consumption is also presented, with attention to potential costs and environmental impact. Results show that accuracy degradation is limited for all four models tested; Inception V3 in particular shows almost no degradation across the two quantisation and deployment pipelines. The Edge TPU is comparable in speed to the CPU and one order of magnitude slower than the GPU, yet the energy consumption of all models deployed on it is several orders of magnitude lower than on either the CPU or the GPU. In the energy-versus-latency parameter space, the CPU, GPU, and Edge TPU are clearly separated. The paper closes by exploring possible future integrations of edge AI technologies with neutrino physics.
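
The abstract names two standard TensorFlow quantisation workflows. As a minimal, illustrative sketch (the InceptionV3 instantiation, random calibration images, and file name below are placeholders, not the paper's actual configuration), full-integer post-training quantisation with the TFLite converter can look like the following; a comment at the end notes how the quantisation-aware-training variant differs.

```python
import numpy as np
import tensorflow as tf

# Placeholder model and calibration sample; the paper tests four Keras
# models (including Inception V3) trained on simulated LArTPC images.
model = tf.keras.applications.InceptionV3(weights=None, classes=3)
calibration_images = np.random.rand(100, 299, 299, 3).astype(np.float32)

def representative_dataset():
    # A sample of representative inputs lets the converter estimate
    # activation ranges for full-integer quantisation.
    for image in calibration_images:
        yield [image[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to int8 kernels so the whole graph can map onto the Edge TPU.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())

# For the quantisation-aware-training pipeline, the model would instead be
# wrapped before training with the TensorFlow Model Optimization toolkit:
#   import tensorflow_model_optimization as tfmot
#   qat_model = tfmot.quantization.keras.quantize_model(model)
# Either way, the resulting .tflite file is then compiled for the
# accelerator with Google's compiler: `edgetpu_compiler model_int8.tflite`
```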

Paper Structure

This paper contains 22 sections, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Examples of CC $\nu_\mu$, CC $\nu_e$, and NC interactions shown in the left, middle, and right panels, respectively. The events are shown in the $w$ readout plane in the $\left(w,x\right)$ parameter space, and the colour scale runs from dark blue (low charge) to cyan (high charge).
  • Figure 2: Balanced accuracy evolution across optimisation pipelines. The left panel shows post-training quantisation and deployment on the Edge TPU; the right panel shows quantisation-aware training followed by Edge TPU compilation. The colour palette was chosen to be distinguishable by people with colour-vision deficiencies (Petroff 2024).
  • Figure 3: Energy consumed per inference (in mJ) for the four baseline models as a function of latency. The colour palette was chosen to be distinguishable by people with colour-vision deficiencies (Petroff 2024). A sketch of how energy per inference follows from measured power and latency is given after this list.
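
The quantity plotted in Figure 3 follows directly from the measured average power draw and the per-inference latency, $E = P \, \Delta t$. A minimal sketch of that conversion is shown below; all power and latency figures in it are illustrative placeholders, not the paper's measurements.

```python
# Relating Figure 3's axes: energy per inference equals the measured
# average power draw times the per-inference latency, E = P * t.
# All numbers below are illustrative placeholders, not the paper's results.
measurements = {
    # device: (average power draw in W, latency per inference in s)
    "CPU (AMD EPYC 7763)": (120.0, 0.020),
    "GPU (NVIDIA A100)":   (250.0, 0.002),
    "Edge TPU":            (2.0,   0.025),
}

for device, (power_w, latency_s) in measurements.items():
    energy_mj = power_w * latency_s * 1e3  # convert J to mJ
    print(f"{device}: {energy_mj:.0f} mJ per inference "
          f"at {latency_s * 1e3:.1f} ms latency")
```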