Hard-Label Cryptanalytic Extraction of Neural Network Models

Yi Chen; Xiaoyang Dong; Jian Guo; Yantian Shen; Anyu Wang; Xiaoyun Wang

TL;DR

This paper proposes the first attack on ReLU neural networks that theoretically achieves functionally equivalent extraction under the hard-label setting, and validates it through practical experiments on a wide range of ReLU neural networks.

Abstract

The machine learning problem of extracting neural network parameters has been proposed for nearly three decades. Functionally equivalent extraction is a crucial goal for research on this problem. When the adversary has access to the raw output of neural networks, various attacks, including those presented at CRYPTO 2020 and EUROCRYPT 2024, have successfully achieved this goal. However, this goal is not achieved when neural networks operate under a hard-label setting where the raw output is inaccessible. In this paper, we propose the first attack that theoretically achieves functionally equivalent extraction under the hard-label setting, which applies to ReLU neural networks. The effectiveness of our attack is validated through practical experiments on a wide range of ReLU neural networks, including neural networks trained on two real benchmarking datasets (MNIST, CIFAR10) widely used in computer vision. For a neural network consisting of $10^5$ parameters, our attack only requires several hours on a single core.
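To make the difference between raw-output access and the hard-label setting concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a fully connected ReLU network with a single scalar output and a hard label obtained by thresholding that output at 0. The function names (`raw_output`, `hard_label`) and the toy weights are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Sequence[float]) -> Vector:
    """Apply ReLU element-wise."""
    return [max(0.0, t) for t in v]


def affine(A: Matrix, b: Sequence[float], x: Sequence[float]) -> Vector:
    """Compute A x + b for one fully connected layer."""
    return [sum(a * t for a, t in zip(row, x)) + bi for row, bi in zip(A, b)]


def raw_output(weights: List[Matrix], biases: List[Vector], x: Sequence[float]) -> float:
    """Raw scalar output f(x): ReLU after every layer except the last."""
    h = list(x)
    for A, b in zip(weights[:-1], biases[:-1]):
        h = relu(affine(A, b, h))
    return affine(weights[-1], biases[-1], h)[0]


def hard_label(weights: List[Matrix], biases: List[Vector], x: Sequence[float]) -> int:
    """Assumed hard-label oracle: only the sign of f(x) is revealed."""
    return 1 if raw_output(weights, biases, x) > 0 else 0


# Hypothetical toy network: 2 inputs, 2 hidden neurons, 1 output.
toy_weights = [[[1.0, -1.0], [0.5, 2.0]], [[1.0, -1.0]]]
toy_biases = [[0.0, -1.0], [0.25]]

print(hard_label(toy_weights, toy_biases, [1.0, 0.0]))
print(hard_label(toy_weights, toy_biases, [0.0, 1.0]))
```

In this sketch, an adversary in the hard-label setting may call only `hard_label`; `raw_output` is what attacks with raw-output access would additionally observe.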

Paper Structure (58 sections, 2 theorems, 49 equations, 4 figures, 2 tables)

Introduction
Functionally Equivalent Extraction.
Hard-label Setting.
Our Results and Techniques
Results.
Techniques.
Organization.
Preliminaries
Basic Definitions and Notations
Adversarial Goals and Assumptions
Assumptions.
Auxiliary Concepts
Model Activation Pattern
Special 'neuron'.
Model Signature
...and 43 more sections

Key Result

Lemma

Based on the system of linear equations presented in Eq. eq:soe-of-k-nn, for $i \in \{2, \cdots, k+1\}$ and $j \in \{1, \cdots, d_i\}$, the extracted weight vector $\widehat{A}_j^{(i)} = \left[ \widehat{w}_{j,1}^{(i)}, \cdots, \widehat{w}_{j, d_{i-1}}^{(i)}\right]$ is [...], where $C_{v, 1}^{(q)} = A_{v}^{(q)} A^{(q-1)} \cdots A^{(2)} \left[ A_{1, 1}^{(1)}, \cdots, A_{d_1, 1}^{(1)} \right]^\top$.

Figures (4)

Figure 1: Left: the critical point $x = [x_1, x_2, x_3]^{\top}$ makes the output of one neuron (e.g., the solid black circle) $0$. Right: the decision boundary point $x^{\prime} = [x_1^{\prime}, x_2^{\prime}, x_3^{\prime}]^{\top}$ makes the output of the neural network $0$.
Figure 2: The core idea of recovering the weight vector of the $j$-th neuron in layer $i$. Let $x = [x_1, \cdots, x_{d_0}]^\top$ be a decision boundary point with $\mathcal{P}^{(i-1)} = 2^{d_{i-1}} - 1$, $\mathcal{P}^{(i)} = 2^{j-1}$, i.e., in layer $i$, only the $j$-th neuron (the red hollow circle) is active, and in layer $i-1$, all the neurons are active. The first $i-1$ layers have been extracted, and collapse into one layer. All the layers starting from layer $i+1$ collapse into a direct connection between the $j$-th neuron in layer $i$ and the final output.
Figure 3: A schematic diagram of finding decision boundary points. The blue solid line stands for the decision hyperplane composed of decision boundary points. The red dashed line stands for a direction vector $\varDelta \in \mathbb{R}^{d_0}$. The starting point $x \in \mathbb{R}^{d_0}$ (i.e., the solid black circle) moves along the direction $\varDelta$, and arrives at $x + s \times \varDelta$ (i.e., the hollow black circle) where $s \in \mathbb{R}$ is the moving stride.
Figure 4: Left: the victim model $f_{\theta}$. Right: the extracted model $f_{\widehat{\theta}}$.
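The search pictured in Figure 3 (moving a starting point $x$ along a direction $\varDelta$ by a stride $s$ until it reaches the decision boundary) can be sketched with hard-label queries only. The code below is a minimal sketch under stated assumptions, not the paper's procedure: it assumes a bisection on $s$, and the names `find_boundary_stride` and `toy_label`, along with the default parameters, are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Label = Callable[[Sequence[float]], int]


def move(x: Sequence[float], delta: Sequence[float], s: float) -> List[float]:
    """Return the point x + s * delta."""
    return [xi + s * di for xi, di in zip(x, delta)]


def find_boundary_stride(
    label: Label,
    x: Sequence[float],
    delta: Sequence[float],
    s_max: float = 1.0,
    tol: float = 1e-9,
    max_doublings: int = 60,
) -> Optional[float]:
    """Find a stride s >= 0 such that x + s * delta lies (within tol) on a label flip.

    Uses hard-label queries only. Returns None when no flip is found
    along the direction within the explored range.
    """
    base = label(x)
    hi = s_max
    # Grow the stride until the label changes, so that [0, hi] brackets a flip.
    for _ in range(max_doublings):
        if label(move(x, delta, hi)) != base:
            break
        hi *= 2.0
    else:
        return None
    lo = 0.0
    # Bisection: keep label(lo) == base and label(hi) != base.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if label(move(x, delta, mid)) == base:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_label(x: Sequence[float]) -> int:
    """Hypothetical one-neuron ReLU model: f(x) = ReLU(x0 + x1) - 1.5."""
    return 1 if max(0.0, x[0] + x[1]) - 1.5 > 0 else 0


s = find_boundary_stride(toy_label, [0.0, 0.0], [1.0, 1.0])
print(s)
```

Because a ReLU network is piecewise linear, a line can cross the decision boundary more than once; this sketch returns one label flip inside the bracketed interval and makes no claim about which one.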

Theorems & Definitions (21)

Definition 1: Extended Functionally Equivalent Extraction
Definition 2: Extended $(\varepsilon, \delta)$-Functional Equivalence
Definition 3: $k$-Deep Neural Network [DBLP:conf/crypto/CarliniJM20]
Definition 4: Fully Connected Layer [DBLP:conf/crypto/CarliniJM20]
Definition 5: Neuron [DBLP:conf/crypto/CarliniJM20]
Definition 6: Neuron State [DBLP:journals/iacr/CanalesMartinezCHRSS23]
Definition 7: Neural Network Architecture [DBLP:conf/crypto/CarliniJM20]
Definition 8: Neural Network Parameters [DBLP:conf/crypto/CarliniJM20]
Definition 9: Hard-Label
Definition 10: Model Parameter Extraction Attack
...and 11 more
