An Efficient Algorithm for Modulus Operation and Its Hardware Implementation in Prime Number Calculation

W. A. Susantha Wijesinghe

TL;DR

The paper presents a novel, FSM-based modulus algorithm for FPGAs that relies solely on addition, subtraction, logical operations, and bit shifts, eliminating the need for multiplication and division. The cycle count scales linearly with the bit length difference (BLD) and follows the fitted model $y=2x+2$, where $y$ is the cycle count and $x$ is the BLD; the method is demonstrated for operands from 32-bit to 2048-bit. Implemented in Verilog on a Xilinx Zynq-7000 board, the design achieves up to 295.5 MHz, uses as few as 7,920 LUTs for 2048-bit operations, and consumes 0.174 W. The author applies the modulus unit to prime number calculation up to 500,000, showing substantial performance gains over software implementations and validating its practicality for cryptographic applications. Overall, the work provides a hardware-efficient, scalable approach to modular arithmetic that can accelerate cryptographic protocols and related high-performance computations.

Abstract

This paper presents a novel algorithm for the modulus operation for FPGA implementation. The proposed algorithm uses only addition, subtraction, logical, and bit-shift operations, avoiding the complexity and hardware cost associated with multiplication and division. It demonstrates consistent performance across operand sizes ranging from 32-bit to 2048-bit, addressing scalability challenges in cryptographic applications. Implemented in Verilog HDL and tested on a Xilinx Zynq-7000 family FPGA, the algorithm shows a predictable linear scaling of cycle count with bit length difference (BLD), described by the equation $y=2x+2$, where $y$ represents the cycle count and $x$ represents the BLD. The application of this algorithm to prime number calculation up to 500,000 shows its practical utility and performance advantages. Comprehensive evaluations reveal efficient resource utilization, robust timing performance, and effective power management, making it suitable for high-performance and resource-constrained platforms. The results indicate that the proposed algorithm significantly improves the efficiency of modular arithmetic operations, with potential implications for cryptographic protocols and secure computing.
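To make the idea concrete, the following is a minimal Python sketch of a generic shift-and-subtract modulus that uses only comparison, subtraction, and bit shifts. It is not the paper's Verilog design: the function names (`mod_shift_subtract`, `bit_length_difference`, `predicted_cycles`) are hypothetical, and the BLD is assumed here to be the difference between the bit lengths of the dividend and the divisor. The cycle estimate simply evaluates the reported model $y=2x+2$; the Python loop itself makes no claim about hardware cycle counts.

```python
def bit_length_difference(a: int, b: int) -> int:
    """BLD, assumed here to be bitlen(dividend) - bitlen(divisor), floored at 0."""
    return max(a.bit_length() - b.bit_length(), 0)


def mod_shift_subtract(a: int, b: int) -> int:
    """Return a mod b using only shifts, comparisons and subtraction.

    Generic shift-and-subtract reduction (an illustrative stand-in,
    not the paper's exact FSM).
    """
    if a < 0 or b <= 0:
        raise ValueError("expected a >= 0 and b > 0")
    shift = bit_length_difference(a, b)
    divisor = b << shift          # align the divisor with the dividend's top bit
    remainder = a
    for _ in range(shift + 1):    # one compare/subtract per aligned position
        if remainder >= divisor:
            remainder -= divisor
        divisor >>= 1             # step the divisor back toward b
    return remainder


def predicted_cycles(bld: int) -> int:
    """Evaluate the fitted cycle model y = 2x + 2 reported in the abstract."""
    return 2 * bld + 2


if __name__ == "__main__":
    a, b = 1000, 7
    bld = bit_length_difference(a, b)
    print(mod_shift_subtract(a, b), bld, predicted_cycles(bld))
```

For `a = 1000` (10 bits) and `b = 7` (3 bits), the assumed BLD is 7, so the model predicts 2·7 + 2 = 16 cycles, and the sketch returns the remainder 6.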

Paper Structure (16 sections, 5 equations, 10 figures, 6 tables, 2 algorithms)


Introduction
Related Work
Methodology
Hardware Architecture for Modulus Operation
Numerical Example
Hardware Implementation of the Modulus Operator
Application to Prime Number Calculation
Hardware Implementation of Prime Number Calculation
Measurements
Results and Discussion
Hardware Implementation of the Novel Algorithm for Modulus Operation
Limitations of the Algorithm
Prime Number Calculation
Comparison
Discussion
...and 1 more section

Figures (10)

Figure 1: Finite State Machine of the modulus algorithm. Here $\mathit{condition1} \equiv (\mathit{divisor} \leq \mathit{dividend}) \land \lnot\,\mathit{divisor}[N-1] \land (\mathit{shift} < N)$ and $\mathit{condition2} \equiv (\mathit{dividend} < B) \lor (\mathit{shift} = 0) \lor (\mathit{shift} \geq N)$.
Figure 2: Register Transfer Level (RTL) diagram illustrating the hardware architecture for implementing the proposed modulus operation algorithm on an FPGA.
Figure 3: The waveform of the Verilog simulation of the proposed algorithm for modulus operation.
Figure 4: Block diagram illustrating the system for evaluating the modulus algorithm on the FPGA.
Figure 5: Detailed data path diagram of the prime calculation system, including registers, the modulus module, and control signals.
...and 5 more figures
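The two guard conditions in the Figure 1 caption can be read as transition tests of a small state machine. The Python sketch below is one possible interpretation, not the paper's FSM: only `condition1` and `condition2` come from the caption, while the state names (`ALIGN`, `REDUCE`, `DONE`), the transitions between them, and the reading of `B` as the original divisor are assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    ALIGN = auto()    # shift the divisor left (assumed state name)
    REDUCE = auto()   # compare/subtract, then shift right (assumed state name)
    DONE = auto()


def condition1(divisor: int, dividend: int, shift: int, n: int) -> bool:
    """(divisor <= dividend) and (!divisor[N-1]) and (shift < N), per the caption."""
    msb_clear = ((divisor >> (n - 1)) & 1) == 0
    return divisor <= dividend and msb_clear and shift < n


def condition2(dividend: int, b: int, shift: int, n: int) -> bool:
    """(dividend < B) or (shift == 0) or (shift >= N); B assumed to be the original divisor."""
    return dividend < b or shift == 0 or shift >= n


def fsm_mod(a: int, b: int, n: int = 32) -> int:
    """Step a hypothetical three-state machine until it reaches DONE; returns a mod b."""
    if not (0 <= a < (1 << n) and 0 < b < (1 << n)):
        raise ValueError("operands must fit in n bits and b must be nonzero")
    dividend, divisor, shift = a, b, 0
    state = State.ALIGN
    while state is not State.DONE:
        if state is State.ALIGN:
            if condition1(divisor, dividend, shift, n):
                divisor <<= 1          # keep aligning while the guard holds
                shift += 1
            else:
                state = State.REDUCE
        else:  # State.REDUCE
            if divisor <= dividend:
                dividend -= divisor    # subtract the aligned divisor
            if condition2(dividend, b, shift, n):
                state = State.DONE     # remainder is now below b
            else:
                divisor >>= 1          # undo one alignment step
                shift -= 1
    return dividend


if __name__ == "__main__":
    print(fsm_mod(1000, 7))
```

Under this reading, the MSB test in `condition1` keeps the shifted divisor inside the N-bit register, and the `dividend < B` term in `condition2` lets the machine stop early once the remainder is already reduced.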
