Improving Location-based Thermal Emission Side-Channel Analysis Using Iterative Transfer Learning

Tun-Chieh Lou, Chung-Che Wang, Jyh-Shing Roger Jang, Henian Li, Lang Lin, Norman Chang

TL;DR

This work addresses data efficiency in deep-learning side-channel attacks (SCAs) on AES-128 by exploiting correlations across the 16 key bytes through iterative transfer learning (ITL). Thermal and power maps are preprocessed with Laplacian filtering and per-byte feature selection to build compact inputs. A model is then fine-tuned for each byte, starting from the model trained for the previous byte, and the procedure typically converges within two iterations. Experiments show that ITL reduces the measurements-to-disclosure (MTD) metric for both MLP and CNN on thermal maps and notably improves CNN performance on power maps, indicating better data efficiency when training data is limited. The results provide a practical framework for leveraging cross-byte correlations in SCA and establish a public dataset benchmark for evaluating data-efficient deep-learning-based SCAs.
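
The preprocessing step mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a 4-neighbour Laplacian kernel with edge padding, and it assumes that points of interest (POIs) are simply the pixels with the largest standard deviation across the filtered training maps. The function names `laplacian_filter` and `select_pois` are hypothetical.

```python
import numpy as np

def laplacian_filter(img):
    """Apply a 4-neighbour discrete Laplacian (assumed kernel) with edge padding."""
    p = np.pad(img, 1, mode="edge")
    # Sum of the four neighbours minus four times the centre pixel.
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * img

def select_pois(maps, n_pois=16):
    """Return the std-dev map of the filtered maps and the n_pois pixels
    with the largest standard deviation (assumed selection rule)."""
    filtered = np.stack([laplacian_filter(m) for m in maps])
    std_map = filtered.std(axis=0)
    # Flat indices sorted by descending standard deviation.
    flat = np.argsort(std_map, axis=None)[::-1][:n_pois]
    rows, cols = np.unravel_index(flat, std_map.shape)
    return std_map, list(zip(rows.tolist(), cols.tolist()))
```

Here `maps` is an array of shape `(n_traces, height, width)`. The returned coordinates could then be used to crop or index a compact input for each key byte.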

Abstract

This paper proposes applying iterative transfer learning to deep learning models for side-channel attacks. Most current side-channel attack methods train a separate model for each key byte and do not consider the correlation between bytes. However, because the model parameters for attacking different bytes may be similar, transfer learning can be used: a model is first trained for one key byte and then serves as the pretrained model for the remaining bytes. Applying this technique repeatedly is known as iterative transfer learning. Experimental results show that, with thermal or power consumption map images as input and a multilayer perceptron or convolutional neural network as the model, the method improves average performance, especially when the amount of data is insufficient.
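
The training schedule described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code. A linear softmax classifier stands in for the MLP or CNN, every byte is assumed to use the same input dimension so that weights can be transferred, and bytes are visited in a fixed order with each model warm-started from the most recently trained one. The names `train_softmax` and `iterative_transfer_learning` are hypothetical.

```python
import numpy as np

def train_softmax(X, y, n_classes, W_init=None, epochs=50, lr=0.1):
    """Fit a linear softmax classifier by gradient descent.
    If W_init is given, training starts from a copy of it (warm start)."""
    n, d = X.shape
    W = np.zeros((d, n_classes)) if W_init is None else W_init.copy()
    Y = np.eye(n_classes)[y]  # one-hot labels
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (P - Y) / n  # cross-entropy gradient step
    return W

def iterative_transfer_learning(X_per_byte, y_per_byte, n_classes=256,
                                n_iters=2, **train_kwargs):
    """Train one model per key byte, each initialised from the previously
    trained byte's model; repeat the whole pass n_iters times (assumed schedule)."""
    n_bytes = len(X_per_byte)
    models = [None] * n_bytes
    prev = None  # the very first model is trained from scratch
    for _ in range(n_iters):
        for b in range(n_bytes):
            models[b] = train_softmax(X_per_byte[b], y_per_byte[b],
                                      n_classes, W_init=prev, **train_kwargs)
            prev = models[b]
    return models
```

In the second pass, byte 0 starts from the model of the last byte, so every model eventually benefits from data of all other bytes. Swapping `train_softmax` for a routine that fine-tunes an MLP or CNN would leave the outer loop unchanged.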

Paper Structure (11 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: An example of the power consumption map.
  • Figure 2: An example of the thermal map.
  • Figure 3: An example of the filtered thermal map.
  • Figure 4: An example of the calculated map of standard deviation using a random split training set, together with the 16 POIs.
  • Figure 5: MTDs of MLP and CNN with or without ITL when the training data size varies.