Physics at the Edge: Benchmarking Quantisation Techniques and the Edge TPU for Neutrino Interaction Recognition
Stefano Vergani, Hilary Utaegbulam, Michael Wang, Leigh H. Whitehead, Arden Tsang, Lorenzo Uboldi
Abstract
This work presents a comprehensive benchmark of quantisation techniques for convolutional neural networks applied to neutrino interaction recognition. Using a simulation of a generic liquid argon time-projection chamber, models are quantised and then deployed on the Google Coral Edge TPU. Four Keras models are tested, and accuracy is measured across two pipelines: post-training integer quantisation and quantisation-aware training. Inference speed is benchmarked against an AMD EPYC 7763 CPU and an NVIDIA A100 GPU. A study of the energy consumption is also presented, with attention to potential costs and environmental issues. Results show that accuracy degradation is limited for all four models and that Inception V3, in particular, shows almost no accuracy degradation in either quantisation and deployment pipeline. The inference speed of the Edge TPU is comparable to that of the CPU and one order of magnitude lower than that of the GPU. Moreover, the energy consumption of all models deployed on the Edge TPU is several orders of magnitude lower than that of the CPU and GPU. In the parameter space of energy consumption versus latency, the performances of the CPU, GPU, and Edge TPU can be clearly separated. This paper also explores possible future integrations of edge AI technologies with neutrino physics.
