BarraCUDA: Edge GPUs do Leak DNN Weights
Peter Horvath, Lukasz Chmielewski, Leo Weissbart, Lejla Batina, Yuval Yarom
TL;DR
BarraCUDA reveals that correlation electromagnetic analysis can extract neural-network weights and biases from edge GPUs running TensorRT, even in noisy, highly parallel environments. The authors build a profiling workflow to identify leakage of partial sums during convolution, then apply a CUDA-accelerated CEMA attack to recover FP16 and INT8 parameters on Nvidia Jetson devices (Nano and Orin Nano). They reverse-engineer TensorRT kernels, localize leakage points using TVLA, and demonstrate parameter extraction on real networks (EfficientNet-B0 variants), with substantial trace requirements but feasible timelines for well-resourced adversaries. The results highlight significant IP-security risks for edge AI deployments and motivate mitigations such as shielding, randomization of multiplication order, and masking to raise the difficulty of weight recovery in practical settings.
Abstract
Over the last decade, applications of neural networks (NNs) have spread to various aspects of our lives. A large number of companies base their businesses on building products that use neural networks for tasks such as face recognition, machine translation, and self-driving cars. Much of the intellectual property underpinning these products is encoded in the exact parameters of the neural networks. Consequently, protecting these parameters is of utmost priority to businesses. At the same time, many of these products need to operate under a strong threat model, in which the adversary has unfettered physical control of the product. In this work, we present BarraCUDA, a novel attack on general-purpose Graphics Processing Units (GPUs) that can extract parameters of neural networks running on the popular Nvidia Jetson Nano device. BarraCUDA uses correlation electromagnetic analysis to recover parameters of real-world convolutional neural networks.
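
To make the idea of correlation electromagnetic analysis concrete, the following is a minimal simulated sketch, not the authors' CUDA-accelerated implementation. It assumes a single INT8 weight multiplied by known INT8 inputs, a Hamming-weight leakage model on the 16-bit product, one leakage sample per trace, and Gaussian noise. The function names (`hamming_weight16`, `simulate_traces`, `cema_recover_weight`) and all parameter values are hypothetical and chosen only for illustration. A real attack works on measured EM traces with many samples and targets partial sums of a convolution rather than one isolated product.

```python
import numpy as np

# Popcount lookup table for all 16-bit words (assumed leakage model: Hamming weight).
_POPCOUNT16 = np.array([bin(i).count("1") for i in range(1 << 16)], dtype=np.uint8)


def hamming_weight16(values):
    """Hamming weight of each value viewed as a 16-bit two's-complement word."""
    words = np.asarray(values, dtype=np.int64) & 0xFFFF
    return _POPCOUNT16[words]


def simulate_traces(inputs, secret_weight, noise_std, rng):
    """Simulate one leakage sample per input: HW(input * weight) plus Gaussian noise."""
    leakage = hamming_weight16(inputs * secret_weight).astype(np.float64)
    return leakage + rng.normal(0.0, noise_std, size=inputs.shape)


def cema_recover_weight(inputs, traces):
    """Return the INT8 candidate whose predicted leakage correlates best with the traces."""
    candidates = np.arange(-128, 128)
    # Hypothetical leakage for every (candidate, input) pair, shape (256, n_traces).
    hyp = hamming_weight16(candidates[:, None] * inputs[None, :]).astype(np.float64)
    hyp_centered = hyp - hyp.mean(axis=1, keepdims=True)
    traces_centered = traces - traces.mean()
    denom = np.linalg.norm(hyp_centered, axis=1) * np.linalg.norm(traces_centered)
    # Pearson correlation per candidate; constant hypotheses (e.g. weight 0) get 0.
    corr = np.zeros(len(candidates))
    valid = denom > 0
    corr[valid] = (hyp_centered[valid] @ traces_centered) / denom[valid]
    # The simulated leakage grows with Hamming weight, so the signed maximum is used.
    return int(candidates[np.argmax(corr)]), corr


rng = np.random.default_rng(0)
inputs = rng.integers(-128, 128, size=5000)   # known inputs chosen by the attacker
traces = simulate_traces(inputs, secret_weight=77, noise_std=1.0, rng=rng)
recovered, corr = cema_recover_weight(inputs, traces)
print("recovered weight:", recovered)
```

In this toy setting the correct candidate stands out because only its predicted Hamming weights match the simulated leakage exactly, while wrong candidates correlate only partially. On real hardware the noise and parallelism are far higher, which is consistent with the substantial trace requirements reported above.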
