Machine Learning needs Better Randomness Standards: Randomised Smoothing and PRNG-based attacks
Pranav Dahiya, Ilia Shumailov, Ross Anderson
TL;DR
This work shows that the randomness used in ML, especially within Randomised Smoothing, can be covertly manipulated via backdoored PRNGs, leading to miscalibrated robustness certificates. It introduces two attack classes (a naive noise-distribution swap and a bit-flipping PRNG attack) that can inflate or deflate the certified radius by a factor of up to $81\times$ while evading the standard NIST randomness tests. The paper demonstrates that existing randomness standards are insufficient for safety-critical ML settings and argues for ML-specific randomness guarantees and testing, including normality-focused diagnostics. The findings have practical implications for security and privacy in ML systems and motivate a shift toward more comprehensive, domain-aware randomness standards and defences in ML toolchains.
Abstract
Randomness supports many critical functions in the field of machine learning (ML) including optimisation, data selection, privacy, and security. ML systems outsource the task of generating or harvesting randomness to the compiler, the cloud service provider or elsewhere in the toolchain. Yet there is a long history of attackers exploiting poor randomness, or even creating it -- as when the NSA put backdoors in random number generators to break cryptography. In this paper we consider whether attackers can compromise an ML system using only the randomness on which it commonly relies. We focus our effort on Randomised Smoothing, a popular approach for training certifiably robust models and for certifying specific input datapoints of an arbitrary model. We choose Randomised Smoothing since it is used for both security and safety -- to counteract adversarial examples and quantify uncertainty respectively. Under the hood, it relies on sampling Gaussian noise to explore the volume around a data point to certify that a model is not vulnerable to adversarial examples. We demonstrate an entirely novel attack, where an attacker backdoors the supplied randomness to falsely certify either an overestimate or an underestimate of robustness by a factor of up to 81. We demonstrate that such attacks are possible, that they require very small changes to randomness to succeed, and that they are hard to detect. As an example, we hide an attack in the random number generator and show that the randomness tests suggested by NIST fail to detect it. We advocate updating the NIST guidelines on random number testing to make them more appropriate for safety-critical and security-critical machine-learning applications.
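To make the dependence on the noise source concrete, the following is a minimal Python sketch of Monte Carlo certification in the style of Randomised Smoothing, not the authors' implementation. Everything in it is an assumption made for illustration: the one-dimensional `toy_classifier`, the `certify` helper, the parameter values, and the use of a simple Hoeffding lower bound on the top-class probability (published procedures typically use a tighter exact binomial bound and separate samples for selecting and estimating the top class). The certified radius is computed as $\sigma\,\Phi^{-1}(\underline{p_A})$, where $\underline{p_A}$ is the lower bound on the top-class probability. The uniform sampler is only a stand-in for a tampered noise source.

```python
import math
import random
from statistics import NormalDist


def toy_classifier(x):
    # Hypothetical 1-D base classifier: class 1 if x > 0, otherwise class 0.
    return 1 if x > 0 else 0


def certify(x, sigma, n, noise, alpha=0.001):
    """Estimate the smoothed prediction at x and a certified radius.

    `noise(sigma)` is whatever sampler the toolchain supplies; the
    certificate is only meaningful if it really draws from N(0, sigma^2).
    """
    counts = {}
    for _ in range(n):
        label = toy_classifier(x + noise(sigma))
        counts[label] = counts.get(label, 0) + 1
    top, hits = max(counts.items(), key=lambda kv: kv[1])
    # Hoeffding lower bound on P(top class), holding with prob. >= 1 - alpha
    # (a simplification chosen to keep the sketch dependency-free).
    p_lower = hits / n - math.sqrt(math.log(1 / alpha) / (2 * n))
    if p_lower <= 0.5:
        return None, 0.0  # abstain: no certificate can be issued
    return top, sigma * NormalDist().inv_cdf(p_lower)


rng = random.Random(0)


def honest_noise(sigma):
    # The Gaussian noise that the certificate assumes.
    return rng.gauss(0.0, sigma)


def tampered_noise(sigma):
    # Illustrative swapped distribution: uniform on [-sigma, sigma].
    return rng.uniform(-sigma, sigma)


x, sigma, n = 0.5, 0.5, 10_000
honest_label, honest_radius = certify(x, sigma, n, honest_noise)
tampered_label, tampered_radius = certify(x, sigma, n, tampered_noise)
print("honest:  ", honest_label, round(honest_radius, 3))
print("tampered:", tampered_label, round(tampered_radius, 3))
```

In this toy setting the decision boundary lies at distance 0.5 from the input, so any certified radius above 0.5 is unsound. With Gaussian noise the sketch stays below that value, whereas the swapped sampler never crosses the boundary and so yields a radius above it, even though the certification code itself is unchanged.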
