BarraCUDA: Edge GPUs do Leak DNN Weights

Peter Horvath, Lukasz Chmielewski, Leo Weissbart, Lejla Batina, Yuval Yarom

TL;DR

BarraCUDA reveals that correlation electromagnetic analysis can extract neural-network weights and biases from edge GPUs running TensorRT, even in noisy, highly parallel environments. The authors build a profiling workflow to identify leakage of partial sums during convolution, then apply a CUDA-accelerated CEMA attack to recover FP16 and INT8 parameters on Nvidia Jetson devices (Nano and Orin Nano). They reverse-engineer TensorRT kernels, localize leakage points using TVLA, and demonstrate parameter extraction on real networks (EfficientNet-B0 variants), with substantial trace requirements but feasible timelines for well-resourced adversaries. The results highlight significant IP-security risks for edge AI deployments and motivate mitigations such as shielding, randomization of multiplication order, and masking to raise the difficulty of weight recovery in practical settings.

Abstract

Over the last decade, applications of neural networks (NNs) have spread to various aspects of our lives. A large number of companies base their businesses on building products that use neural networks for tasks such as face recognition, machine translation, and self-driving cars. Much of the intellectual property underpinning these products is encoded in the exact parameters of the neural networks. Consequently, protecting these is of utmost priority to businesses. At the same time, many of these products need to operate under a strong threat model, in which the adversary has unfettered physical control of the product. In this work, we present BarraCUDA, a novel attack on general-purpose Graphics Processing Units (GPUs) that can extract parameters of neural networks running on the popular Nvidia Jetson Nano device. BarraCUDA uses correlation electromagnetic analysis to recover parameters of real-world convolutional neural networks.
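
To make the idea of correlation electromagnetic analysis concrete, the following is a minimal simulated sketch, not the authors' implementation. It assumes a single trace sample that leaks the Hamming weight of a 16-bit product of a known INT8 input and a secret INT8 weight, plus Gaussian noise; the paper itself targets partial sums inside TensorRT convolution kernels. All names here (`simulate_traces`, `recover_weight`, the noise level, the trace count) are hypothetical and chosen for illustration only.

```python
import random


def hamming_weight(x, bits=16):
    """Number of set bits in the two's-complement encoding of x (assumed leakage model)."""
    return bin(x & ((1 << bits) - 1)).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate_traces(secret_weight, n_traces, noise_std, rng):
    """Simulate one leaking sample per trace: HW(input * weight) + Gaussian noise."""
    inputs = [rng.randint(-128, 127) for _ in range(n_traces)]
    samples = [
        hamming_weight(x * secret_weight) + rng.gauss(0.0, noise_std)
        for x in inputs
    ]
    return inputs, samples


def recover_weight(inputs, samples):
    """Try every INT8 candidate and keep the one whose predicted leakage
    correlates best with the measured samples."""
    best_candidate, best_corr = None, -2.0
    for candidate in range(-128, 128):
        hypothesis = [hamming_weight(x * candidate) for x in inputs]
        corr = pearson(hypothesis, samples)
        if corr > best_corr:
            best_candidate, best_corr = candidate, corr
    return best_candidate, best_corr


if __name__ == "__main__":
    rng = random.Random(0)
    inputs, samples = simulate_traces(secret_weight=37, n_traces=1000, noise_std=0.5, rng=rng)
    print(recover_weight(inputs, samples))
```

In this toy setting the attacker knows the inputs and enumerates all 256 weight candidates. Real measurements are far noisier and contain many time samples, which is consistent with the substantial trace requirements and the CUDA-accelerated analysis mentioned in the TL;DR.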

Paper Structure

This paper contains 38 sections, 3 equations, 9 figures, and 3 tables.

Figures (9)

  • Figure 1: High-level description of the attack procedure.
  • Figure 2: EM probe locations for the Jetson Nano and Jetson Orin Nano devices for successful parameter extraction attacks.
  • Figure 3: Raw trace of the whole operation on the GPU of Jetson Nano. The two convolutional layers (light pink and yellow) are clearly separated in the traces. Additionally, the CUDA device-to-host memory copy (light purple) is clearly visible at the end of the trace.
  • Figure 4: Result of fixed vs. random TVLA for the first weight (top) and the bias (bottom) in FP16 convolution with 30K and 37K traces, respectively. The middle depicts an example trace. The dashed red line indicates the 4.5 threshold.
  • Figure 5: Key ranks and correlations of the different FP16 weights in the first and second layers on the Jetson Nano.
  • ...and 4 more figures
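
The fixed vs. random TVLA mentioned in the Figure 4 caption can be illustrated with a small sketch. It assumes the usual formulation: Welch's t-test computed per time sample between a set of traces captured with a fixed value and a set captured with random values, flagging samples where |t| exceeds 4.5. The function names and the synthetic data are hypothetical, and this is not the authors' tooling.

```python
import random


def welch_t(a, b):
    """Welch's t-statistic between two groups of measurements."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / (va / na + vb / nb) ** 0.5


def tvla(fixed_traces, random_traces, threshold=4.5):
    """Return per-sample t-values and the indices whose |t| exceeds the threshold."""
    n_samples = len(fixed_traces[0])
    t_values = [
        welch_t([tr[i] for tr in fixed_traces], [tr[i] for tr in random_traces])
        for i in range(n_samples)
    ]
    leaky = [i for i, t in enumerate(t_values) if abs(t) > threshold]
    return t_values, leaky


def synthetic_traces(n_traces, n_samples, leak_index, offset, rng):
    """Gaussian-noise traces with a constant offset added at one sample (toy leakage)."""
    traces = []
    for _ in range(n_traces):
        trace = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        trace[leak_index] += offset
        traces.append(trace)
    return traces


if __name__ == "__main__":
    rng = random.Random(1)
    fixed = synthetic_traces(500, 5, leak_index=2, offset=2.0, rng=rng)
    rand = synthetic_traces(500, 5, leak_index=2, offset=0.0, rng=rng)
    t_values, leaky = tvla(fixed, rand)
    print(leaky)
```

Samples flagged this way only indicate that the measurement depends on the processed data; they are a way to localize candidate leakage points before mounting a correlation attack, not an attack in themselves.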