An Efficient All-to-All GCD Algorithm for Low Entropy RSA Key Factorization

Elijah Pelofske

TL;DR

The paper addresses the vulnerability of RSA moduli arising from low-entropy prime generation by enabling shared primes to be detected via batch GCD. It introduces the Binary Tree Batch GCD algorithm, which builds a product tree to aggregate gcd results into a single product $B$ of all non-trivial shared factors, then recovers the individual shared primes by gcd-ing $B$ with each modulus $N_i$. Compared to the prior remainder-tree batch GCD method, the proposed approach achieves similar asymptotic scaling but demonstrates a practical speedup of about $6\times$ in timing experiments on 1024- and 2048-bit moduli. This work enhances the efficiency of security assessments for RSA deployments and could extend to other cryptosystems that rely on large primes, enabling faster detection of weak, low-entropy keys.
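The procedure summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `binary_tree_batch_gcd`, the rule of carrying an unpaired node up one level, and the toy primes are all hypothetical choices made here. The sketch also does not handle a modulus whose two primes are both shared with other keys; in that case $\gcd(B, N_i) = N_i$ and the modulus is skipped.

```python
from math import gcd


def binary_tree_batch_gcd(moduli):
    """Sketch: return ({index: (p, q)}, B) for moduli sharing a prime factor."""
    level = list(moduli)
    shared = 1  # B: running product of every non-trivial GCD found in the tree
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            a, b = level[i], level[i + 1]
            g = gcd(a, b)
            if g > 1:
                shared *= g
            # The product is only needed if another level follows (not at the root).
            if len(level) > 2:
                next_level.append(a * b)
        # Assumption: an unpaired node is carried up unchanged.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level

    # Recover individual shared primes with one GCD per modulus.
    factors = {}
    for idx, n in enumerate(moduli):
        p = gcd(shared, n)
        if 1 < p < n:
            factors[idx] = (p, n // p)
    return factors, shared


# Toy example with small primes; keys 0 and 2 share the prime 101.
toy_moduli = [101 * 103, 107 * 109, 101 * 113, 127 * 131]
toy_factors, toy_b = binary_tree_batch_gcd(toy_moduli)
# toy_factors == {0: (101, 103), 2: (101, 113)} and toy_b == 101
```

In this toy run the shared prime is only found at the root, where a single GCD between the two half-products stands in for all four cross pairs.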

Abstract

RSA is an incredibly successful and useful asymmetric encryption algorithm. One type of implementation flaw in RSA is low entropy in key generation, specifically in the prime number creation stage. This can occur due to flawed usage of random prime number generator libraries, or on computers that lack a source of external entropy. These implementation flaws result in some RSA keys sharing prime factors, which means that the full factorization of the public modulus can be recovered incredibly efficiently by computing the GCD of the two public key moduli that share the prime factor. However, since one does not know a priori which of the composite moduli share a prime factor, determining whether any such shared prime factors exist requires an all-to-all GCD attack (also known as a batch GCD attack, or a bulk GCD attack) on the available public keys. This study describes a novel all-to-all batch GCD algorithm, referred to as the binary tree batch GCD algorithm, that is more efficient than the current best batch GCD algorithm (the remainder tree batch GCD algorithm, which is a product tree followed by a remainder tree computation). The two methods are compared using a dataset of random RSA moduli constructed such that some of the moduli share prime factors. The proposed binary tree batch GCD algorithm has better runtime than the existing remainder tree batch GCD algorithm, although asymptotically it has nearly identical scaling and its complexity depends on how many shared prime factors exist in the set of RSA keys. In practice, the implementation of the proposed binary tree batch GCD algorithm has a roughly 6x speedup compared to the standard remainder tree batch GCD approach.
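To make the underlying weakness concrete, the following sketch shows the naive all-to-all attack that batch GCD methods are designed to avoid: one GCD per pair of moduli, which is $\frac{M \cdot (M-1)}{2}$ GCD calls for $M$ keys. The function name `naive_all_pairs_gcd` and the toy primes are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations
from math import gcd


def naive_all_pairs_gcd(moduli):
    """Sketch: factor every modulus that shares exactly one prime with another."""
    hits = {}
    for (i, a), (j, b) in combinations(enumerate(moduli), 2):
        g = gcd(a, b)
        # A non-trivial GCD smaller than both moduli is a shared prime factor.
        if 1 < g < min(a, b):
            hits[i] = (g, a // g)
            hits[j] = (g, b // g)
    return hits


# Toy example with small primes; keys 0 and 2 share the prime 101.
pair_moduli = [101 * 103, 107 * 109, 101 * 113, 127 * 131]
pair_hits = naive_all_pairs_gcd(pair_moduli)
# pair_hits == {0: (101, 103), 2: (101, 113)}
```

The quadratic number of GCD calls is what makes this approach impractical for large key collections.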

Paper Structure (6 sections, 4 figures, 1 table)

Introduction
Terminology and Variable Definitions
Time Complexity Assumptions
Binary Tree Batch GCD Algorithm
Computational Complexity Timing Results
Discussion and Conclusion

Figures (4)

Figure 1: Diagram of the all-to-all GCD algorithm for $8$ integers labeled $N_1, \ldots, N_8$. Blue lines denote the GCD operation, red arrows denote the integer multiplication operation, black circles denote integers, and the thin black ovals denote the logical pairs of integers that have a GCD operation performed on them and are then multiplied together. Because this example uses exactly $8$ integers, this is a complete binary tree. The algorithm starts at the leaf nodes, which are pairs of the input integers. A single GCD operation occurs between each pair, and then each pair is multiplied together (denoted by the two red arrows). This operation is performed recursively as the tree is built until the final step, where the GCD between the product of one half of the integers and the product of the other half is computed. Note that at that final step, there is no need to compute the product of those two large integers. In total, the number of GCD operations equals the number of ovals (the pair groupings, one per GCD), which is one fewer than the number of input integers: $7$ in this example. The key characteristic of this recursive GCD approach is that, taken together, the individual GCD computations cover all of the $\frac{M \cdot (M-1)}{2}$ edges in the complete all-to-all graph (clique) on the set $N$ of $M$ input integers, where each edge of this conceptual clique is a GCD operation. This can be seen at the root of the binary tree, where a single GCD operation covers $16$ individual GCD($N_i, N_j$) operations; the output is not the distinct common divisors, but the product of all of the common divisors between those two large integers. At each GCD operation in the binary tree, the result is either $1$ or a non-trivial greatest common divisor. If the GCD is a non-trivial common divisor, it is recorded: all of the non-trivial common divisors found are multiplied together into a single integer denoted as $B$. $B$ is thus the product of all of the shared factors within the set of input integers $N$.
Figure 2: Compute time scaling for both the original remainder tree batch GCD algorithm (blue), and the binary tree batch GCD approach proposed in this study (green). The top row used $2048$ bit RSA moduli, and the bottom row used $1024$ bit RSA moduli. The data shown in the left column had exactly $2$ weak RSA moduli ($2$ RSA keys shared a prime factor) in the pool of RSA keys, the plots in the middle column show the results for exactly $100$ weak RSA moduli, and the right column is for exactly $1000$ weak RSA moduli. The binary tree batch GCD scaling is consistently better than the remainder tree batch GCD algorithm.
Figure 3: Polynomial curve fit scaling, in terms of CPU compute time, for $1024$ bit RSA keys with exactly $1000$ weak keys. Left plot shows the scaling for the remainder tree batch GCD algorithm, and the right plot shows the scaling for the binary tree batch GCD algorithm.
Figure 4: Speedup factor of the binary tree batch GCD run relative to the remainder tree batch GCD run, as a function of the number of RSA moduli analyzed by the batch GCD algorithm.
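For context on the baseline that Figures 2 to 4 compare against, the remainder tree batch GCD method (a product tree followed by a remainder tree) can be sketched as follows. This is the commonly described form of that method, written here as an illustration; the function names and toy primes are assumptions, and the implementation benchmarked in the paper may differ in its details.

```python
from math import gcd, prod


def product_tree(values):
    """Build levels of pairwise products, from the leaves up to the full product."""
    tree = [list(values)]
    while len(tree[-1]) > 1:
        level = tree[-1]
        tree.append([prod(level[i:i + 2]) for i in range(0, len(level), 2)])
    return tree


def remainder_tree_batch_gcd(moduli):
    """Sketch: return gcd(N_i, product of all other moduli) for each N_i."""
    tree = product_tree(moduli)
    remainders = tree.pop()  # root level: [P], the product of all moduli
    while tree:
        level = tree.pop()
        # Push P down the tree, reducing modulo the square of each node.
        remainders = [remainders[i // 2] % (n * n) for i, n in enumerate(level)]
    # At the leaves, remainders[i] == P mod N_i^2; dividing by N_i gives (P / N_i) mod N_i.
    return [gcd(r // n, n) for r, n in zip(remainders, moduli)]


# Toy example with small primes; keys 0 and 2 share the prime 101.
rt_moduli = [101 * 103, 107 * 109, 101 * 113, 127 * 131]
rt_gcds = remainder_tree_batch_gcd(rt_moduli)
# rt_gcds == [101, 1, 101, 1]
```

A result of $1$ means the modulus shares no prime with any other key in the set; any other value smaller than the modulus is a recovered prime factor.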
