An Efficient Algorithm for Modulus Operation and Its Hardware Implementation in Prime Number Calculation
W. A. Susantha Wijesinghe
TL;DR
The paper presents a novel, FSM-based modulus operation algorithm for FPGA that relies solely on addition, subtraction, logical operations, and bit shifts, eliminating the need for multiplication and division. The cycle count scales linearly with the bit-length difference (BLD) between the operands, following the fitted model $y=2x+2$, and the method is demonstrated for operands from 32-bit to 2048-bit. Implemented in Verilog on a Xilinx Zynq-7000 board, the design achieves up to 295.5 MHz, uses as few as 7,920 LUTs for 2048-bit operations, and consumes 0.174 W. The author applies the modulus unit to prime number calculation up to 500,000, showing substantial performance gains over software implementations and validating its practicality for cryptographic applications. Overall, the work provides a hardware-efficient, scalable approach to modular arithmetic that can accelerate cryptographic protocols and related high-performance computations.
Abstract
This paper presents a novel algorithm for the modulus operation, designed for FPGA implementation. The proposed algorithm uses only addition, subtraction, logical, and bit-shift operations, avoiding the complexity and hardware cost of multiplication and division. It demonstrates consistent performance across operand sizes ranging from 32-bit to 2048-bit, addressing scalability challenges in cryptographic applications. Implemented in Verilog HDL and tested on a Xilinx Zynq-7000 family FPGA, the algorithm shows a predictable linear scaling of cycle count with bit-length difference (BLD), described by the equation $y=2x+2$, where $y$ is the cycle count and $x$ is the BLD. Applying the algorithm to prime number calculation up to 500,000 demonstrates its practical utility and performance advantages. Comprehensive evaluations reveal efficient resource utilization, robust timing performance, and effective power management, making the design suitable for both high-performance and resource-constrained platforms. The results indicate that the proposed algorithm significantly improves the efficiency of modular arithmetic operations, with potential implications for cryptographic protocols and secure computing.
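To make the idea of a multiplication-free and division-free modulus concrete, the following is a minimal software sketch of a generic shift-and-subtract reduction. It is not the paper's FSM design: the function names `shift_sub_mod` and `is_prime` are hypothetical, and the loop structure is an assumption chosen only to show how a remainder can be obtained with comparisons, subtractions, and shifts. The loop runs BLD + 1 times; if each iteration were to take two clock cycles in hardware, that would agree with $y=2x+2$, but this reading is an assumption and is not confirmed by the text above.

```python
def shift_sub_mod(a: int, b: int) -> int:
    """Return a mod b (a >= 0, b > 0) using only compare, subtract, and shift."""
    if a < 0 or b <= 0:
        raise ValueError("expects a >= 0 and b > 0")
    # Bit-length difference (BLD) between dividend and divisor.
    bld = a.bit_length() - b.bit_length()
    if bld < 0:
        return a  # dividend is already smaller than the divisor
    d = b << bld  # align the divisor's MSB with the dividend's MSB
    r = a
    for _ in range(bld + 1):
        if r >= d:   # conditional subtraction
            r -= d
        d >>= 1      # move the aligned divisor one bit to the right
    return r


def is_prime(n: int) -> bool:
    """Trial-division primality test built on shift_sub_mod (illustrative only)."""
    if n < 2:
        return False
    d = 2
    # The bound d * d <= n uses a multiplication purely for brevity in software.
    while d * d <= n:
        if shift_sub_mod(n, d) == 0:
            return False
        d += 1
    return True


if __name__ == "__main__":
    print(shift_sub_mod(100, 7))
    print([p for p in range(30) if is_prime(p)])
```

A hardware version would replace the Python loop with a state machine and fixed-width registers; the sketch only illustrates why the work grows with the BLD rather than with the operand width itself.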
