Kraken: Higher-order EM Side-Channel Attacks on DNNs in Near and Far Field

Peter Horvath, Ilia Shumailov, Lukasz Chmielewski, Lejla Batina, Yuval Yarom

TL;DR

This work demonstrates parameter extraction from the GPU's specialized Tensor Core units, the most commonly used GPU units today owing to their superior performance, using near-field physical side-channel attacks based on Correlation Power Analysis (CPA).

Abstract

The multi-million-dollar investment required for modern machine learning (ML) has made large ML models a prime target for theft. In response, the field of model stealing has emerged. Attacks based on physical side-channel information have shown that DNN model extraction is feasible, even on CUDA Cores in a GPU. For the first time, our work demonstrates parameter extraction from the GPU's specialized Tensor Core units, the most commonly used GPU units today owing to their superior performance, using near-field physical side-channel attacks. Previous work targeted only the general-purpose CUDA Cores, the functional units that have been part of the GPU since its inception. Our method is tailored to the GPU architecture to accurately estimate energy consumption and derive efficient attacks via Correlation Power Analysis (CPA). Furthermore, we provide an exploratory analysis of hyperparameter and weight leakage from LLMs in the far field and demonstrate that the GPU's electromagnetic radiation leaks even at a distance of 100 cm through a glass obstacle.
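
As a rough illustration of the CPA idea mentioned in the abstract, the following minimal sketch recovers one simulated 8-bit weight by correlating predicted leakage with noisy measurements. Everything in it is an assumption made for illustration and is not taken from the paper: the toy leakage model (Hamming weight of the low byte of weight times input), the Gaussian noise level, the trace count, and the helper names `simulate_traces` and `cpa_recover_weight`. The traces are synthetic; no GPU or EM measurement is involved.

```python
import math
import random

# Hamming weight (number of set bits) of every byte value.
HW = [bin(v).count("1") for v in range(256)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def simulate_traces(secret_weight, inputs, noise_sigma, rng):
    """Hypothetical leakage: HW of the low byte of weight * input, plus noise."""
    return [
        HW[(secret_weight * x) & 0xFF] + rng.gauss(0.0, noise_sigma)
        for x in inputs
    ]


def cpa_recover_weight(inputs, traces):
    """Return the candidate weight whose predicted leakage correlates best."""
    best_candidate, best_corr = None, -2.0
    for candidate in range(1, 256):
        predicted = [HW[(candidate * x) & 0xFF] for x in inputs]
        corr = pearson(predicted, traces)
        if corr > best_corr:
            best_candidate, best_corr = candidate, corr
    return best_candidate, best_corr


rng = random.Random(0)                      # fixed seed for repeatability
inputs = [rng.randrange(256) for _ in range(2000)]   # known inputs
traces = simulate_traces(173, inputs, noise_sigma=1.0, rng=rng)
recovered, corr = cpa_recover_weight(inputs, traces)
print(recovered, round(corr, 3))
```

The attacker side only sees `inputs` and `traces`; the secret value 173 is used solely to generate the synthetic measurements. A real attack would replace the toy model with a leakage model fitted to the target hardware and would correlate against specific points of interest in measured traces.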

Paper Structure (36 sections, 8 equations, 17 figures)

Figures (17)

  • Figure 1: High-level description of the attacks presented in the paper. First, a floorplan of the target chip is made to precisely identify the GPU SMs on the die. Second, near-field probing is used to test and validate the hypotheses of our leakage models, including warp-level and higher-order attacks. The near-field approaches are validated on convolutional layers. Lastly, we provide an exploratory analysis of the leakage of different hyperparameters and the weights of an LLM in the far field, using the leakage models derived in the near field.
  • Figure 2: Image of the full die on the left and a partial infrared die shot on the right. The rectangle on the right highlights 2 of the 16 Streaming Multiprocessors (SMs) on the GPU die.
  • Figure 3: Zoomed-in image of the two SMs with annotations. The partitioned L1 cache/Shared Memories are in the upper-left and upper-right corners. More importantly, each SM contains 4 sub-partitions, each with its own Tensor Core unit. We place the EM probe directly above or in the vicinity of one sub-partition.
  • Figure 4: Example of a raw trace collected from a convolutional layer implemented on Tensor Cores. The vertical red lines highlight the Points-of-Interest (PoI) used for weight extraction.
  • Figure 5: Warp-level correlation of the eighth weight in the first kernel.
  • ...and 12 more figures