Statistical testing of random number generators and their improvement using randomness extraction

Cameron Foreman; Richie Yeung; Florian J. Curchod

TL;DR

This study develops a tunable statistical testing environment (STE) for rigorous RNG evaluation, benchmarked on three widely used generators (32-bit LFSR, Intel RDSEED, IDQ Quantis) using multiple standard test suites. It introduces a four-tier randomness extraction hierarchy (deterministic, seeded, two-source, and physical) implemented via the Circulant extractor in Cryptomite, coupled with post-processing guided by min-entropy estimates. Across levels 2–4, post-processing significantly improves statistical properties, with level 4 leveraging semi-device-independent quantum protocols to certify additional entropy; however, some sources remain challenging due to intrinsic min-entropy limitations. The authors provide open-source access to STE and the extraction toolkit, demonstrating practical pathways to robust RNGs beyond standard certification tests and highlighting the limits of statistical testing in guaranteeing cryptographic unpredictability. These results have direct implications for designing and certifying cryptographic RNGs by combining diverse extraction paradigms with comprehensive, repeatable statistical testing.

Abstract

Random number generators (RNGs) are notoriously challenging to build and test, especially for cryptographic applications. While statistical tests cannot definitively guarantee an RNG's output quality, they are a powerful verification tool and the only universally applicable testing method. In this work, we design, implement, and present various post-processing methods, using randomness extractors, to improve the RNG output quality and compare them through statistical testing. We begin by performing intensive tests on three RNGs -- the 32-bit linear feedback shift register (LFSR), Intel's 'RDSEED,' and IDQuantique's 'Quantis' -- and compare their performance. Next, we apply the different post-processing methods to each RNG and conduct further intensive testing on the processed output. To facilitate this, we introduce a comprehensive statistical testing environment, based on existing test suites, that can be parametrised for lightweight (fast) to intensive testing.
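To make concrete what a single statistical test does, the following minimal sketch implements the frequency (monobit) test from the NIST suite listed in the outline. It is an illustration only, not the authors' testing environment; the function name `monobit_p_value` and the 0.01 threshold used in the usage note are assumptions chosen for this example.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test p-value for a sequence of 0/1 values.

    Maps 0 -> -1 and 1 -> +1, sums the result, and compares the
    normalised sum against the behaviour expected of unbiased bits.
    """
    n = len(bits)
    if n == 0:
        raise ValueError("need at least one bit")
    # Sum of +1/-1 values; close to 0 for a balanced sequence.
    s_n = sum(1 if b else -1 for b in bits)
    # Normalised test statistic.
    s_obs = abs(s_n) / math.sqrt(n)
    # Two-sided p-value under the hypothesis of unbiased, independent bits.
    return math.erfc(s_obs / math.sqrt(2))
```

A perfectly balanced input such as `[0, 1] * 50` gives a p-value of 1, while a constant input such as `[1] * 100` gives a p-value far below a threshold like 0.01. Note that the balanced but obviously predictable input passes, which illustrates why passing a statistical test does not guarantee unpredictability.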

Paper Structure (44 sections, 19 equations, 8 figures, 33 tables)

Introduction
Related Work
Summary of Results
Tools and Definitions
Statistical Testing
Existing Test Suites
NIST Statistical Test Suite
Diehard(er) Statistical Test Suite
TestU01 Statistical Test Suite
ENT Statistical Test Suite
PractRand Statistical Test Suite
Our Statistical Testing Environment
Suggested Settings
Light
Recommended
...and 29 more sections

Figures (8)

Figure S1: This figure illustrates our implementation set-up. The black box represents one of the initial RNGs that we test, and the dashed box denotes the new (in principle, improved) RNG with additional post-processing applied.
Figure S2: An illustration of the set-up that we consider. An RNG generates a bit string $X=x$ of length $n$. In this work, we first study the statistical properties of the realisation $x$ of the (random variable) $X$. Then, we analyse the effects of different post-processing methods applied to it.
Figure S3: Illustration of the set of sources, or input distributions, that can be successfully extracted from by different randomness extraction methods. (Right) weak input distributions and (Left) second input, or weak seed, distributions. Deterministic extractors (level 1) require additional properties on the weak input but do not need a second input source. Seeded extractors (level 2) relax the need for additional properties of the weak input and extract from sources with min-entropy only, at the cost
Figure S4: The above plots show the number of statistical tests (left) failed and (right) failed or suspicious for each initial RNG at each post-processing level. The $x$ axis indicates the level, with level 0 being the initial RNG with no additional post-processing, and levels 1--4 being deterministic, seeded, two-source, and physical extraction, respectively. The $y$ axis is the number of statistical tests failed (left) or failed or suspicious (right), out of 4600, using a logarithmic scale: for $f$ failed or failed or suspicious tests, $y = \log_2(f+1)$. The shaded region in the left plot illustrates the successful region, whereby the RNG fails fewer than 7.5 tests, and the white region illustrates the 'unacceptable' region, in which, with high probability, near-perfect randomness is not produced. We note that we are unable to use the 32-bit LFSR at level 4 because of its low initial estimated min-entropy rate, $\alpha_{\mathsf{RNG}}$, as detailed and evaluated in \ref{sec:rng-analysis}.
Figure S5: Here, level 1 of our post-processing methods is performed by using a deterministic extractor, namely the Von Neumann extractor, on the initial output of the RNG.
...and 3 more figures
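The Von Neumann extractor named in the caption of Figure S5 can be sketched in a few lines. This is the textbook construction, written here as an illustration rather than the authors' implementation; the function name `von_neumann_extract` and the convention of outputting the first bit of each unequal pair are assumptions of this sketch.

```python
def von_neumann_extract(bits):
    """Textbook Von Neumann extractor over a list of 0/1 values.

    Reads the input in non-overlapping pairs. A pair (0, 1) yields 0,
    a pair (1, 0) yields 1, and equal pairs are discarded. A trailing
    unpaired bit is ignored.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            # Convention assumed here: output the first bit of the pair.
            out.append(first)
    return out
```

For example, `von_neumann_extract([0, 1, 1, 0, 0, 0, 1, 1, 1, 0])` returns `[0, 1, 1]`. The output is unbiased only if the input bits are independent and identically distributed, which is the kind of additional property on the weak input that deterministic extractors (level 1) require.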

Theorems & Definitions (10)

Definition 1: Min-entropy
Definition 2: Block min-entropy
Definition 3: Statistical distance
Definition 4: $\epsilon$-perfect randomness
Definition 5: p-value
Definition 6: Deterministic randomness extractor
Definition 7: Seeded randomness extractor
Definition 8: Strong seeded extractor
Definition 9: Two-source randomness extractor
Definition 10: Strong two-source extractor
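Only the titles of the definitions appear in this listing. As a worked illustration of Definitions 1 and 3, the sketch below uses the standard textbook forms, $H_{\infty}(X) = -\log_2 \max_x \Pr[X = x]$ for min-entropy and $\frac{1}{2}\sum_x |P(x) - Q(x)|$ for statistical distance. The paper's exact formulations may differ (for example, by conditioning on side information), and the function names are assumptions of this example.

```python
import math


def min_entropy(probs):
    """Min-entropy in bits of a distribution given as a list of probabilities."""
    return -math.log2(max(probs))


def statistical_distance(p, q):
    """Statistical (total variation) distance between two distributions.

    Each distribution is a dict mapping outcomes to probabilities;
    outcomes missing from a dict are treated as having probability 0.
    """
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in outcomes)
```

For a single bit with $\Pr[0] = 0.75$, `min_entropy([0.75, 0.25])` is about 0.415 bits, and `statistical_distance({0: 0.75, 1: 0.25}, {0: 0.5, 1: 0.5})` is 0.25, so this bit is 0.25 away from uniform in statistical distance.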
