Make Shuffling Great Again: A Side-Channel Resistant Fisher-Yates Algorithm for Protecting Neural Networks
Leonard Puškáč, Marek Benovič, Jakub Breier, Xiaolu Hou
TL;DR
The paper addresses the vulnerability of software-based shuffling in neural networks to side-channel attacks by proposing a side-channel resistant Fisher-Yates shuffle that combines masking for modular reduction with Blakely's method for modular multiplication. The core contribution is a protected shuffling scheme that conceals the shuffle index and eliminates division-related leakage, enabling secure parameter protection on general-purpose MCUs. Empirical CPA experiments on an ARM Cortex-M4 show that the protected shuffle prevents reliable weight recovery, while unprotected implementations remain susceptible, even with large numbers of traces. The memory overhead is $2\times$ the largest layer, and the time overhead falls from $4\%$ for a layer with $100$ neurons to $0.49\%$ for a layer with $1000$ neurons. The work demonstrates a practical, hardware-agnostic defense suitable for embedded NN deployments and lays groundwork for combining hiding with masking in future work.
Abstract
Neural network models implemented in embedded devices have been shown to be susceptible to side-channel attacks (SCAs), allowing recovery of proprietary model parameters, such as weights and biases. Countermeasures already used to protect cryptographic implementations can be tailored to protect embedded neural network models. Shuffling, a hiding-based countermeasure that randomizes the order of computations, was shown to be vulnerable to SCA when the Fisher-Yates algorithm is used. In this paper, we propose a design of an SCA-secure version of the Fisher-Yates algorithm. By integrating the masking technique for modular reduction and Blakely's method for modular multiplication, we effectively remove the vulnerability in the division operation that led to side-channel leakage in the original version of the algorithm. We experimentally show that the countermeasure is effective against SCA by mounting a correlation power analysis attack on an embedded neural network model running on an ARM Cortex-M4. Compared to the original proposal, the memory overhead is $2\times$ the biggest layer of the network, while the time overhead varies from $4\%$ to $0.49\%$ for layers with $100$ and $1000$ neurons, respectively.
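To make the two building blocks named above concrete, the following minimal Python sketch shows (a) a textbook, unprotected Fisher-Yates shuffle, in which the swap index comes from a modular reduction, and (b) Blakely's interleaved shift-and-add method, which computes a modular product using only doubling, addition, comparison, and subtraction. It is an illustration under stated assumptions, not the protected scheme proposed in the paper: the function names `fisher_yates_unprotected` and `blakely_modmul` and the `rand_word` parameter are hypothetical, the masking step is not shown, and the code makes no attempt at constant-time execution.

```python
import secrets


def blakely_modmul(a: int, b: int, n: int) -> int:
    """Return (a * b) mod n with Blakely's interleaved shift-and-add method.

    Illustrative only: assumes a >= 0 and 0 <= b < n. No division or
    remainder instruction is used, but the branches below are
    data-dependent, so this sketch is NOT side-channel hardened.
    """
    if n <= 0 or a < 0 or not (0 <= b < n):
        raise ValueError("require n > 0, a >= 0 and 0 <= b < n")
    r = 0
    # Scan the bits of a from most significant to least significant.
    for shift in range(a.bit_length() - 1, -1, -1):
        r = 2 * r
        if (a >> shift) & 1:
            r += b
        # r < n held before this step, so now r < 3n:
        # at most two subtractions bring it back below n.
        if r >= n:
            r -= n
        if r >= n:
            r -= n
    return r


def fisher_yates_unprotected(items, rand_word=lambda: secrets.randbits(32)):
    """Textbook Fisher-Yates shuffle, returning a shuffled copy.

    rand_word is a hypothetical source of random 32-bit words. The
    reduction `rand_word() % (i + 1)` stands in for the division-based
    step that the abstract identifies as the source of leakage.
    """
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        j = rand_word() % (i + 1)  # modular reduction (division on an MCU)
        out[i], out[j] = out[j], out[i]
    return out


if __name__ == "__main__":
    print(blakely_modmul(12345, 678, 1009) == (12345 * 678) % 1009)
    print(sorted(fisher_yates_unprotected(range(10))) == list(range(10)))
```

In this sketch the shuffle still relies on `%`; the point of the second function is only to show that a modular product can be obtained without a division instruction, which is the property the abstract relies on when it replaces the leaking operation.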
