Machine Learning Cryptanalysis of a Quantum Random Number Generator

Nhan Duy Truong, Jing Yan Haw, Syed Muhamad Assad, Ping Koy Lam, Omid Kavehei

TL;DR

This work addresses the vulnerability of RNGs to adversarial environmental information by applying a predictive ML framework to a continuous-variable QRNG. Using a recurrent convolutional neural network, the authors quantify how deterministic classical noise can create learnable patterns in raw QRNG outputs and demonstrate that appropriate entropy extraction and post-processing suppress these patterns, restoring unpredictability. The study also tests ML on a congruential RNG and shows reduced predictability with longer periods, while QRNG outputs pass NIST randomness tests, validating the robustness of the approach as a benchmarking tool. Overall, ML provides a device-agnostic means to assess unpredictability and can guide design and processing choices to ensure cryptographic-quality randomness.

Abstract

Random number generators (RNGs) that are crucial for cryptographic applications have been the subject of adversarial attacks. These attacks exploit environmental information to predict generated random numbers that are supposed to be truly random and unpredictable. Though quantum random number generators (QRNGs) are based on the intrinsically indeterministic nature of quantum properties, the presence of classical noise in the measurement process compromises the integrity of a QRNG. In this paper, we develop a predictive machine learning (ML) analysis to investigate the impact of deterministic classical noise in different stages of an optical continuous-variable QRNG. Our ML model successfully detects inherent correlations when the deterministic noise sources are prominent. After appropriate filtering and randomness extraction processes are introduced, our QRNG system, in turn, demonstrates its robustness against ML. We further demonstrate the robustness of our ML approach by applying it to uniformly distributed random numbers from the QRNG and a congruential RNG. Hence, our results show that ML has potential for benchmarking the quality of RNG devices.
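
To illustrate why a congruential RNG becomes harder to predict as its period grows, here is a minimal sketch. It is not the paper's method: the paper uses a neural network, whereas this sketch uses a plain lookup table of observed transitions as a stand-in predictor. The multiplier, increment, and moduli are illustrative assumptions chosen so that each generator has full period, and the function names are hypothetical.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield the stream x_{k+1} = (a * x_k + c) mod m, starting after the seed."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def cycle_length(seed, a, c, m):
    """Return the length of the cycle the generator eventually enters."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]


def table_accuracy(seed, a, c, m, n_train, n_test):
    """Memorise transitions in the first n_train outputs, then score
    next-value predictions on the following n_test transitions."""
    stream = list(islice(lcg(seed, a, c, m), n_train + n_test))
    train, test = stream[:n_train], stream[n_train - 1:]
    table = dict(zip(train, train[1:]))  # current value -> next value
    hits = sum(table.get(cur) == nxt for cur, nxt in zip(test, test[1:]))
    return hits / n_test


# Assumed parameters: a = 5, c = 3 with a power-of-two modulus (full period).
short = table_accuracy(seed=0, a=5, c=3, m=16, n_train=17, n_test=100)
long_ = table_accuracy(seed=0, a=5, c=3, m=8192, n_train=17, n_test=100)
print(cycle_length(0, 5, 3, 16), short)
print(cycle_length(0, 5, 3, 8192), long_)
```

With the same 17 training samples, the predictor has seen every transition of the period-16 generator but almost none of the period-8192 generator, so its accuracy collapses as the period outgrows the training data.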

Paper Structure

This paper contains 14 sections, 5 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: Block diagram of the QRNG. A laser source is used to generate quantum entropy. During a measurement (denoted by “M”), the statistics of the quantum state are inevitably mixed with entropy of classical origin. By sacrificing partially random bits, a post-processing randomness extractor stage (denoted by “Ext”) transforms the distribution into a smaller output set with an almost uniform distribution.
  • Figure 2: Data acquisition stages in the entropy blocks of the QRNG. Stage (a): (i) detector 1 and (ii) detector 2; Stage (b): (i) difference and (ii) sum of the photocurrents; Stage (c): difference of the photocurrents demodulated at (i) $1.375$ GHz and (ii) $1.625$ GHz; Stage (d): low-pass filtering of the signals from (c).
  • Figure 3: Data preparation for each stage. The training set consists of $5$ million samples. Each test set $\textrm{T}_{k}$ has $1$ million samples. The raw data is a sequence of $13$-bit integers collected at each stage. $N$ neighboring numbers are used as one input sample, and the next number is taken as the label.
  • Figure 4: Recurrent convolutional neural network (RCNN) model: two convolutional layers, marked by (1), are followed by an LSTM layer, marked by (2), and two fully-connected layers, marked by (3). As a single input to the RCNN, $N$ ($100$) $13$-bit integers are first encoded into one-hot vectors. These $100$ one-hot vectors (blank squares) go to two convolutional layers, each of which is followed by a max-pooling of size $2$. The first convolutional layer has $64$ filters of length $5$, and the second has $128$ filters of length $3$. The outputs of the second convolutional layer are fed to the LSTM layer, which has an output size of $128$. An unrolled representation of the LSTM layer is marked by (2i), where all green blocks are copies of the LSTM at different steps. Each green block takes a piece of the input and information from its previous block to generate an output. The output of size $128$ generated by the last block (marked by ($\ast$)), which carries information about the whole input sequence, is connected to $2$ fully-connected layers with output sizes of $64$ and $n$, where $n$ is the number of possible $13$-bit integer values in the dataset.
  • Figure 5: Probability distribution of the quadrature measurement of the vacuum. Maximum $P_X(x_i)$ of this set is $0.0137$ when $x_i=-26$.
  • ...and 5 more figures
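
The data preparation described in the Figure 3 caption, together with the one-hot encoding mentioned in the Figure 4 caption, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the window stride of 1 is an assumption (the caption does not state it), and mapping each distinct value to a class index is one plausible reading of "$n$ is the number of possible $13$-bit integer values in the dataset".

```python
def make_windows(sequence, n):
    """Split a sequence into (inputs, labels): each input holds n
    neighboring values and its label is the value that follows.
    Assumption: consecutive windows are shifted by one position."""
    count = len(sequence) - n
    inputs = [sequence[i:i + n] for i in range(count)]
    labels = [sequence[i + n] for i in range(count)]
    return inputs, labels


def build_index(values):
    """Map each distinct value to a class index in sorted order, so that
    the number of classes equals the number of values actually present."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def one_hot(class_index, num_classes):
    """Return a one-hot list with a single 1 at class_index."""
    vec = [0] * num_classes
    vec[class_index] = 1
    return vec


# Toy stand-in for a raw stream; the caption uses N = 100 on millions of samples.
raw = [3, -1, 4, -1, 5]
inputs, labels = make_windows(raw, n=2)
index = build_index(raw)
encoded = [[one_hot(index[v], len(index)) for v in window] for window in inputs]
targets = [index[v] for v in labels]
print(inputs, labels)
print(encoded[0], targets[0])
```

Indexing by observed values rather than by the raw integer keeps the sketch valid for signed samples (Figure 5 reports a value of $-26$) and keeps the output layer no wider than the data requires.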