Is the Hard-Label Cryptanalytic Model Extraction Really Polynomial?

Akira Ito, Takayuki Miura, Yosuke Todo

TL;DR

This work questions the polynomial-time claims made for hard-label model extraction of ReLU DNNs, arguing that persistent and dead neurons undermine the assumptions behind prior cryptanalytic-style attacks. It introduces CrossLayer Extraction, which recovers persistent neurons by leveraging cross-layer interactions, yielding a best-effort reconstruction that remains correct with high probability under certain conditions. The authors provide theoretical bounds and empirical evidence that query requirements can grow exponentially in deeper networks, and they demonstrate a practical cross-layer approach that mitigates this cost. The results illuminate fundamental limits of hard-label extraction and motivate new defenses and attack strategies under realistic noise and depth, with implications for security and IP protection of DNNs.

Abstract

Deep Neural Networks (DNNs) have attracted significant attention, and their internal models are now considered valuable intellectual assets. Extracting these internal models through access to a DNN is conceptually similar to extracting a secret key via oracle access to a block cipher. Consequently, cryptanalytic techniques, particularly differential-like attacks, have been actively explored recently. ReLU-based DNNs are the most commonly and widely deployed architectures. While early works (e.g., Crypto 2020, Eurocrypt 2024) assume access to exact output logits, which are usually invisible, more recent works (e.g., Asiacrypt 2024, Eurocrypt 2025) focus on the hard-label setting, where only the final classification result (e.g., "dog" or "car") is available to the attacker. Notably, Carlini et al. (Eurocrypt 2025) demonstrated that model extraction is feasible in polynomial time even under this restricted setting. In this paper, we first show that the assumptions underlying their attack become increasingly unrealistic as the attack-target depth grows. In practice, satisfying these assumptions requires an exponential number of queries with respect to the attack depth, implying that the attack does not always run in polynomial time. To address this critical limitation, we propose a novel attack method called CrossLayer Extraction. Instead of directly extracting the secret parameters (e.g., weights and biases) of a specific neuron, which incurs exponential cost, we exploit neuron interactions across layers to extract this information from deeper layers. This technique significantly reduces query complexity and mitigates the limitations of existing model extraction approaches.
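To make the hard-label setting concrete, the following is a minimal sketch of an oracle that exposes only the predicted class of a ReLU network, together with a generic bisection that narrows down a decision-boundary point between two differently labelled inputs. The helper names (`make_random_mlp`, `hard_label_oracle`, `bisect_boundary`) and the Gaussian weight initialisation are assumptions made for illustration; this is not the procedure of the paper or of Carlini et al.

```python
import random


def affine(W, b, x):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def make_random_mlp(widths, seed=0):
    """Hypothetical helper: Gaussian weights and biases for the given layer widths."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(widths, widths[1:]):
        W = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.gauss(0, 1) for _ in range(n_out)]
        layers.append((W, b))
    return layers


def logits(layers, x):
    """Forward pass: ReLU after every layer except the last."""
    h = list(x)
    for W, b in layers[:-1]:
        h = [max(0.0, p) for p in affine(W, b, h)]
    W, b = layers[-1]
    return affine(W, b, h)


def hard_label_oracle(layers):
    """Wrap the network so the caller sees only the argmax class index."""
    def query(x):
        z = logits(layers, x)
        return max(range(len(z)), key=z.__getitem__)
    return query


def bisect_boundary(query, x_a, x_b, steps=40):
    """Shrink the segment [x_a, x_b] around a point where the label changes."""
    y_a = query(x_a)
    if query(x_b) == y_a:
        raise ValueError("endpoints must receive different labels")
    for _ in range(steps):
        mid = [(a + b) / 2 for a, b in zip(x_a, x_b)]
        if query(mid) == y_a:
            x_a = mid
        else:
            x_b = mid
    return x_a, x_b


if __name__ == "__main__":
    query = hard_label_oracle(make_random_mlp([4, 8, 8, 3], seed=1))
    print(query([0.1, -0.2, 0.3, 0.4]))
```

Each bisection step costs one oracle query, so the attacker's budget is naturally measured in calls to `query`; the logits themselves never leave `hard_label_oracle`.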

Paper Structure

This paper contains 43 sections, 2 theorems, 47 equations, 10 figures, 2 tables, 3 algorithms.

Key Result

Theorem

Let $\lambda_{\min}(\mathbf{Q})$ denote the smallest nonzero eigenvalue of $\mathbf{Q}$. Then

Figures (10)

  • Figure 1: ReLU-based DNN and block cipher
  • Figure 2: Overview of model extraction with hard-label setting.
  • Figure 3: Intersection point search and signature recovery.
  • Figure 4: Layer-wise minimum neuron activation (or inactivation) probabilities.
  • Figure 5: Switching probability versus the proportion of discovered intersection points in the 10th layers of the trained MLPs.
  • ...and 5 more figures

Theorems & Definitions (9)

  • Definition: ReLU [DBLP:journals/jmlr/GlorotBB11]
  • Definition: ReLU Network
  • Definition: Intersection space and intersection point
  • Definition: Signature
  • Definition: Dead neuron and persistent neuron
  • Theorem
  • Proof
  • Lemma
  • Proof
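
The definitions themselves are not reproduced in this listing. As a rough illustration of the dead and persistent neurons they name, the sketch below estimates per-neuron activation frequencies on random inputs, assuming that a dead neuron is one that is never active and a persistent neuron is one that is always active. The function names, the standard Gaussian input distribution, and the sample size are all assumptions; a frequency of 0 or 1 on a finite sample only suggests, and does not prove, that a neuron is dead or persistent.

```python
import random


def activation_rates(hidden_layers, n_samples=2000, seed=0):
    """Estimate, per hidden neuron, the fraction of sampled inputs on which it is active."""
    rng = random.Random(seed)
    n_in = len(hidden_layers[0][0][0])
    counts = [[0] * len(b) for _, b in hidden_layers]
    for _ in range(n_samples):
        # Assumed input distribution: i.i.d. standard Gaussian coordinates.
        h = [rng.gauss(0, 1) for _ in range(n_in)]
        for k, (W, b) in enumerate(hidden_layers):
            pre = [sum(w * v for w, v in zip(row, h)) + b_i for row, b_i in zip(W, b)]
            for i, p in enumerate(pre):
                if p > 0:
                    counts[k][i] += 1
            h = [max(0.0, p) for p in pre]
    return [[c / n_samples for c in layer] for layer in counts]


def label_neurons(rates):
    """Tag each neuron by its empirical behaviour on the sample."""
    def tag(r):
        if r == 0.0:
            return "dead"
        if r == 1.0:
            return "persistent"
        return "switching"
    return [[tag(r) for r in layer] for layer in rates]


def layer_min_switch_rate(rates):
    """Per layer, the smallest of min(rate, 1 - rate) over its neurons."""
    return [min(min(r, 1.0 - r) for r in layer) for layer in rates]


if __name__ == "__main__":
    # Toy layer: one never-active, one always-active, one switching neuron.
    toy = [([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0]], [-1.0, 1.0, 0.0])]
    rates = activation_rates(toy)
    print(label_neurons(rates), layer_min_switch_rate(rates))
```

The per-layer minimum computed by `layer_min_switch_rate` is meant only as a loose analogue of the quantity plotted in Figure 4; the paper's exact estimator is not given in this listing.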