Make Shuffling Great Again: A Side-Channel Resistant Fisher-Yates Algorithm for Protecting Neural Networks

Leonard Puškáč; Marek Benovič; Jakub Breier; Xiaolu Hou

TL;DR

The paper addresses the vulnerability of software-based shuffling in neural networks to side-channel attacks by proposing a side-channel resistant Fisher-Yates shuffle that combines masking for modular reduction with Blakely's method for modular multiplication. The core contribution is a protected shuffling scheme that conceals the shuffle index and eliminates division-related leakage, enabling parameter protection on general-purpose MCUs. Correlation power analysis (CPA) experiments on an ARM Cortex-M4 show that the protected shuffle prevents reliable weight recovery even with large numbers of traces, while the unprotected implementation remains susceptible. The memory overhead is $2\times$ the largest layer, and the time overhead falls from $4\%$ to $0.49\%$ as the layer size grows from $100$ to $1000$ neurons. The work demonstrates a practical, hardware-agnostic defense suitable for embedded NN deployments and lays groundwork for combining hiding with masking in future work.

Abstract

Neural network models implemented in embedded devices have been shown to be susceptible to side-channel attacks (SCAs), allowing recovery of proprietary model parameters, such as weights and biases. Countermeasures already used to protect cryptographic implementations can be tailored to protect embedded neural network models. Shuffling, a hiding-based countermeasure that randomly shuffles the order of computations, was shown to be vulnerable to SCA when the Fisher-Yates algorithm is used. In this paper, we propose a design of an SCA-secure version of the Fisher-Yates algorithm. By integrating the masking technique for modular reduction and Blakely's method for modular multiplication, we effectively remove the vulnerability in the division operation that led to side-channel leakage in the original version of the algorithm. We experimentally show that the countermeasure is effective against SCA by mounting a correlation power analysis attack on an embedded neural network model implemented on ARM Cortex-M4. Compared to the original proposal, the memory overhead is $2\times$ the biggest layer of the network, while the time overhead varies from $4\%$ to $0.49\%$ for layers with $100$ and $1000$ neurons, respectively.
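To make the two building blocks named in the abstract concrete, here is a minimal Python sketch. It is not the authors' protected algorithm: `fisher_yates_mod` is a textbook Fisher-Yates shuffle that, by assumption, derives each index by reducing a random word modulo `i + 1` (the division step the abstract identifies as leaky), and `blakely_modmul` is the standard interleaved form of Blakely's method, which computes a modular product with shifts, additions, and conditional subtractions only. All function and parameter names are hypothetical, no masking is applied, and the branches are not constant-time.

```python
import secrets


def fisher_yates_mod(items, rand_word=lambda: secrets.randbits(32)):
    """Textbook Fisher-Yates shuffle (unprotected baseline).

    Assumption: the index is obtained as `random word % (i + 1)`.
    """
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        j = rand_word() % (i + 1)  # modular reduction, typically a division
        out[i], out[j] = out[j], out[i]
    return out


def blakely_modmul(a, b, n, bits=32):
    """Return (a * b) mod n without any division or `%` operator.

    Requires 0 <= a < 2**bits and 0 <= b < n. Bits of `a` are consumed
    from most to least significant; the accumulator stays below n.
    """
    if n <= 0 or not 0 <= b < n:
        raise ValueError("need n > 0 and 0 <= b < n")
    if not 0 <= a < (1 << bits):
        raise ValueError("a must fit in `bits` bits")
    r = 0
    for i in range(bits - 1, -1, -1):
        # Double the accumulator and add b if the current bit of a is set.
        r = 2 * r + ((a >> i) & 1) * b
        # r < 3n here, so at most two subtractions restore r < n.
        if r >= n:
            r -= n
        if r >= n:
            r -= n
    return r
```

A quick check such as `blakely_modmul(123456789, 77, 100) == (123456789 * 77) % 100` confirms that the division-free routine agrees with the built-in reduction. How the paper combines this with masking is described in its "Masked Shuffling" section and is not reproduced here.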

Paper Structure (20 sections, 23 equations, 13 figures, 4 algorithms)

Introduction
Background
Side-Channel Attacks
Side-Channel Countermeasures
Related Work
SCAs on Neural Networks
Countermeasures against SCAs on Neural Networks
Countermeasure Proposal
Blakely's method for modular multiplication
Masked Shuffling
Security of the implementation against the attack of Ganesan et al. (2023)
Shuffling multiplication
Experimental Evaluation
CPA Attack Steps
Experimental Setup
...and 5 more sections

Figures (13)

Figure 1: A flow chart depiction of the proposed protected version of the Fisher-Yates algorithm.
Figure 2: Power traces corresponding to the computation of the first hidden layer in (a) unprotected and (b) protected implementations. The duration of each neuron's computation is clearly distinguishable in both cases, as indicated by the red dotted lines.
Figure 3: Power traces for the first neuron computation in the first hidden layer for (a) unprotected and (b) protected implementations. In both cases, the durations of multiplication operations are distinguishable (red dotted lines). However, in the unprotected network, the first multiplication (time samples $490$–$1010$) corresponds to the first input neuron, while in the protected case, this correspondence is obscured.
Figure 4: CPA attack results for the unprotected implementation. The $y$-axis represents the absolute correlation. The red lines correspond to the correct values associated with the correct weight of $1.43$, while the gray lines correspond to incorrect values.
Figure 5: CPA attack results for the protected implementation. The $y$-axis represents the absolute correlation. The red lines correspond to the correct values associated with the correct weight of $1.43$, while the gray lines correspond to incorrect values.
...and 8 more figures
