Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models

Kevin Hector, Pierre-Alain Moellic, Mathieu Dumont, Jean-Max Dutertre

TL;DR

This work focuses on embedded deep neural network models on 32-bit microcontrollers, a widespread family of hardware platforms in IoT, and on the use of a standard fault injection strategy, the Safe Error Attack (SEA), to perform a model extraction attack by an adversary with limited access to training data.

Abstract

Model extraction is emerging as a critical security threat, with attack vectors exploiting both algorithmic and implementation-based approaches. The main goal of an attacker is to steal as much information as possible about a protected victim model, so that they can mimic it with a substitute model, even with limited access to similar training data. Recently, physical attacks such as fault injection have shown worrying efficiency against the integrity and confidentiality of embedded models. We focus on embedded deep neural network models on 32-bit microcontrollers, a widespread family of hardware platforms in IoT, and on the use of a standard fault injection strategy, the Safe Error Attack (SEA), to perform a model extraction attack by an adversary with limited access to training data. Since the attack strongly depends on the input queries, we propose a black-box approach to craft a successful attack set. For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs. This information makes it possible to efficiently train a substitute model, with only 8% of the training dataset, that reaches high fidelity and an accuracy level nearly identical to that of the victim model.

Paper Structure (19 sections, 1 equation, 5 figures, 8 tables)

Figures (5)

  • Figure 1: The adversary crafts inputs and performs a safe-error attack exploiting faulted predictions with bit-set fault injections on the parameters stored in memory. The objective is to partially recover the bits of the parameters to efficiently train a substitute model that mimics the victim model with high fidelity.
  • Figure 2: $\nabla_{w}\mathcal{L}$ distribution per layer for Certain and Uncertain inputs. Each box spans the first to third quartiles, with the median marked inside. Blue whiskers extend 1.5x the interquartile range beyond the box, and black circles are outliers.
  • Figure 3: Illustration of the LSBL principle.
  • Figure 4: Bits recovered with SEA and LSBL for (left) CNN and (right) MLP (random inputs only).
  • Figure 5: Distribution of recovered bits w.r.t. number of inputs.
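
The safe-error principle summarized in the Figure 1 caption can be illustrated with a toy simulation. The sketch below is not the authors' implementation: the integer linear classifier, the unsigned 8-bit weight format, the random (uncrafted) inputs, and the names `predict`, `bit_set`, and `safe_error_attack` are all assumptions chosen for illustration. It forces one bit of a stored weight to 1 and compares the faulted prediction with the clean one. A changed prediction proves the bit was 0. An unchanged prediction is only weak evidence that the bit was 1, which is why the choice of inputs matters.

```python
import random

def predict(weights, x):
    """Toy integer linear classifier: one weight row per class, argmax of scores."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))

def bit_set(value, bit):
    """Model a bit-set fault: force one bit of a stored value to 1."""
    return value | (1 << bit)

def safe_error_attack(weights, inputs, bit):
    """Guess the chosen bit of every weight from clean vs. faulted predictions.

    `weights` stands in for the victim device's memory; the adversary only
    observes the predictions, never the weight values themselves.
    """
    clean = [predict(weights, x) for x in inputs]
    guesses = []
    for r, row in enumerate(weights):
        guess_row = []
        for c, w in enumerate(row):
            faulted = [list(rw) for rw in weights]
            faulted[r][c] = bit_set(w, bit)
            changed = any(predict(faulted, x) != y for x, y in zip(inputs, clean))
            # Changed output: the fault altered the weight, so the bit was 0.
            # Unchanged output: guess 1 (safe error), which may be wrong.
            guess_row.append(0 if changed else 1)
        guesses.append(guess_row)
    return guesses

random.seed(0)
weights = [[random.randrange(256) for _ in range(4)] for _ in range(3)]
inputs = [[random.randrange(256) for _ in range(4)] for _ in range(50)]

guesses = safe_error_attack(weights, inputs, bit=7)
true_bits = [[(w >> 7) & 1 for w in row] for row in weights]
correct = sum(g == t for gr, tr in zip(guesses, true_bits) for g, t in zip(gr, tr))
print(f"correctly guessed {correct} of {sum(len(r) for r in weights)} most significant bits")
```

In this toy setting every guess of 0 is certain, while guesses of 1 can be false positives when no input is sensitive enough to expose the fault. Adding inputs that sit close to a decision boundary reduces such errors, which matches the abstract's remark that the attack strongly depends on the input queries.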