Transformer models as an efficient replacement for statistical test suites to evaluate the quality of random numbers
Rishabh Goel, YiZi Xiao, Ramin Ramezani
TL;DR
The paper addresses the need for efficient validation of randomness, particularly for Quantum Random Number Generator (QRNG) outputs, by replacing the slow, one-test-at-a-time NIST Statistical Test Suite (STS) with an encoder-only Transformer that predicts multi-label pass probabilities for several STS tests. Through a hyper-parameter search, the authors identify a compact model (one encoder layer, a single attention head, an embedding size of 192) with an averaging mechanism that achieves a Macro F1-score above 0.96 and runs substantially faster than both the NIST STS and LSTM baselines. The approach demonstrates that Transformers can parallelize and scale randomness evaluation while maintaining accuracy, offering a practical path toward replacing traditional test suites. The work suggests broad applicability to real-time randomness validation and lays the groundwork for extending the encoding to the full set of NIST STS tests.
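To make the reported metric concrete, the following is a minimal sketch of how a Macro F1-score can be computed from multi-label pass probabilities. It is not the authors' evaluation code: the 0.5 cutoff, the three-test toy data, the function names `threshold` and `macro_f1`, and the convention of scoring an empty label as 0 are all assumptions made for illustration.

```python
def threshold(probs, cutoff=0.5):
    """Turn per-test pass probabilities into 0/1 labels (cutoff is an assumption)."""
    return [[1 if p >= cutoff else 0 for p in row] for row in probs]


def macro_f1(y_true, y_pred):
    """Compute F1 separately for each label (test), then average with equal weight."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no positives at all scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Toy data: 4 sequences, 3 hypothetical tests (1 = pass, 0 = fail).
y_true = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
probs = [[0.9, 0.8, 0.2], [0.7, 0.4, 0.6], [0.3, 0.9, 0.4], [0.8, 0.6, 0.9]]

y_pred = threshold(probs)
score = macro_f1(y_true, y_pred)
print(f"Macro F1 = {score:.4f}")
```

Because every test contributes equally to the average, a model cannot reach a high Macro F1 by doing well only on the tests that most sequences pass.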
Abstract
Random numbers are essential in many fields, and validating them remains important for safety. A Quantum Random Number Generator (QRNG) can theoretically generate truly random numbers; however, the quality of its output still needs to be thoroughly validated. The task of validating random numbers is generally delegated to statistical tests such as those in the NIST Statistical Test Suite (STS), which are often slow and perform only one test at a time. Our work presents a deep learning model based on the Transformer architecture that 1) performs multiple NIST STS tests at once and 2) runs much faster. The model outputs multi-label classification results indicating whether a sequence passes each of these statistical tests. We performed a thorough hyper-parameter optimization to converge on the best-performing model and, as a result, achieved a Macro F1-score above 0.96. We also compared this model to a conventional deep learning method for quantifying randomness (Long Short-Term Memory recurrent neural networks) and showed that our model achieved similar performance while being much more efficient and scalable. The high performance and efficiency of this Transformer-based model show that it can be a viable replacement for the NIST STS for validating random numbers.
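For readers unfamiliar with the suite, the sketch below shows the simplest NIST STS test, the Frequency (Monobit) test, as described in NIST SP 800-22. It illustrates the kind of per-test pass/fail decision that a multi-label classifier would be trained to reproduce. The function names are hypothetical, the 0.01 significance level is the value commonly used with the suite, and nothing here is taken from the authors' pipeline.

```python
import math


def monobit_p_value(bits):
    """Frequency (Monobit) test: checks whether ones and zeros are roughly balanced."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum.
    s_n = sum(1 if b == 1 else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def passes_monobit(bits, alpha=0.01):
    """A sequence passes when its p-value is at least the significance level."""
    return monobit_p_value(bits) >= alpha


# Worked example from NIST SP 800-22: the 10-bit sequence 1011010101.
example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
p_example = monobit_p_value(example)
print(f"p-value = {p_example:.6f}, pass = {passes_monobit(example)}")

# A constant sequence is heavily biased and should fail.
print(passes_monobit([0] * 100))
```

Each test in the suite produces its own p-value in this manner, so running the suite means running the tests one after another. A single forward pass that predicts all pass/fail labels together is what the abstract proposes in place of that sequence of tests.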
