Efficient Fault Detection Architectures for Modular Exponentiation Targeting Cryptographic Applications Benchmarked on FPGAs
Saeed Aghapour, Kasra Ahmadi, Mehran Mozaffari Kermani, Reza Azarderakhsh
TL;DR
The paper addresses fault attacks on modular exponentiation by introducing a recomputation-based fault-detection architecture that relies on input encoding. It proposes two schemes: a full recomputation approach with near-100% error coverage, and a more efficient partial recomputation approach that preserves detection capability at substantially lower overhead. Encoding the base as $x_{enc}=x+k_{x}N$ and the exponent as $y_{enc}=y+k_{y}\phi(N)$ yields two independently encoded computations whose outputs can be cross-checked, so that both transient and permanent faults show up as a mismatch. Across software (ARM Cortex-A72) and hardware (Zynq UltraScale+ and Artix-7 FPGA) implementations, the method achieves high fault coverage with only about $7.66\%$ computational overhead and less than $1\%$ area overhead, which makes it practical for cryptographic applications on constrained devices.
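The cross-check described above can be illustrated with a minimal Python sketch of the full recomputation idea. It is not the authors' architecture: the names `encode_inputs`, `checked_modexp`, and `FaultDetected`, the 16-bit range of the random multipliers, and the use of Python's built-in `pow` as the exponentiation unit are all assumptions made for illustration. The sketch also assumes $\gcd(x, N) = 1$, so that Euler's theorem guarantees $x_{enc}^{y_{enc}} \equiv x^{y} \pmod{N}$ for any choice of $k_x$ and $k_y$.

```python
import math
import secrets


class FaultDetected(Exception):
    """Raised when the two encoded computations disagree."""


def encode_inputs(x, y, n, phi_n, k_bits=16):
    # Hypothetical helper: draw random multipliers k_x, k_y >= 1 and
    # return (x + k_x*N, y + k_y*phi(N)). k_bits is an assumed parameter.
    k_x = secrets.randbelow(1 << k_bits) + 1
    k_y = secrets.randbelow(1 << k_bits) + 1
    return x + k_x * n, y + k_y * phi_n


def checked_modexp(x, y, n, phi_n, modexp=pow):
    # Assumption: gcd(x, N) == 1, so x^(y + k*phi(N)) = x^y (mod N).
    if math.gcd(x, n) != 1:
        raise ValueError("this sketch assumes gcd(x, N) == 1")

    # Two independent encodings of the same (x, y).
    x_a, y_a = encode_inputs(x, y, n, phi_n)
    x_b, y_b = encode_inputs(x, y, n, phi_n)

    # Both runs operate on different operands but must agree modulo N.
    r_a = modexp(x_a, y_a, n)
    r_b = modexp(x_b, y_b, n)

    if r_a != r_b:
        raise FaultDetected("encoded results differ")
    return r_a


if __name__ == "__main__":
    # Toy parameters only: N = 61 * 53, phi(N) = 60 * 52.
    n, phi_n = 61 * 53, 60 * 52
    print(checked_modexp(65, 17, n, phi_n) == pow(65, 17, n))
```

Because the two runs process different operands, a fault that corrupts one run is unlikely to produce the same wrong residue in the other. This sketch doubles the work; the partial recomputation scheme summarized above is what brings the overhead down, and it is not modeled here.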
Abstract
Whether they stem from malicious intent or natural causes, faults and errors can significantly undermine the reliability of any architecture. Fault detection therefore plays a pivotal role in the secure deployment of cryptosystems. Even when a cryptosystem is mathematically secure, its practical implementation may remain susceptible to exploitation through side-channel attacks. In this paper, we propose a lightweight fault detection architecture tailored for modular exponentiation, a building block of numerous cryptographic applications ranging from classical cryptography to post-quantum cryptography. Based on our simulation and implementation results on an ARM Cortex-A72 processor and on AMD/Xilinx Zynq UltraScale+ and Artix-7 FPGAs, our approach achieves an error detection rate close to 100% while introducing a modest computational overhead of approximately 7% and an area overhead of less than 1% compared to the unprotected architecture. To the best of our knowledge, no such approach benchmarked on both an ARM processor and FPGAs has been proposed and assessed to date.
