Interactive Simulations of Backdoors in Neural Networks
Peter Bajcsy, Maxime Bros
TL;DR
This work introduces an interactive web-based platform for studying cryptographic backdoors in neural networks, focusing on checksum-based backdoors injected into digital signature verification and into activation functions. It formalizes a simple checksum $csum(v)$ and demonstrates how backdoors can be triggered with secret keys, and it implements a proximity-based defense for detecting adversarial inputs. The results illustrate both the feasibility and the limitations of such backdoors in small-scale networks, as well as practical constraints on web-based interactivity and robustness. The framework serves as an educational and research tool for exploring how cryptographic backdoors in AI systems are planted, activated, and defended against, with potential implications for model integrity in practice.
Abstract
This work addresses the problem of planting and defending against cryptography-based backdoors in artificial intelligence (AI) models. The motivation is our limited understanding of how cryptographic techniques, which can plant undetectable backdoors under theoretical assumptions, affect the large AI model systems deployed in practice. Our approach is to design a web-based simulation playground that enables planting, activating, and defending against cryptographic backdoors in neural networks (NNs). Planting and activating backdoors can be simulated in two scenarios: an extension of the NN model architecture that supports digital signature verification, and a modified architectural block for non-linear operators. Backdoor defense can be simulated using proximity analysis, which turns the playground into a game of planting and defending against backdoors. The simulations are available at https://pages.nist.gov/nn-calculator
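To make the second scenario and the proximity-based defense more concrete, the following is a minimal Python sketch, not the implementation used in the web playground. The checksum shown here (a digit sum at fixed precision), the names `SECRET_KEY`, `backdoored_relu`, and `looks_suspicious`, and all numeric thresholds are assumptions introduced purely for illustration; the actual definition of $csum(v)$ is the one given in the body of the paper.

```python
# Illustrative sketch only: every name and constant below is hypothetical.

def csum(v: float, digits: int = 4) -> int:
    """Hypothetical checksum: digit sum of |v| at a fixed decimal precision."""
    scaled = round(abs(v) * 10**digits)
    return sum(int(d) for d in str(scaled))

SECRET_KEY = 31        # hypothetical trigger value known only to the attacker
BACKDOOR_OUTPUT = 10.0  # hypothetical output forced when the trigger fires

def relu(v: float) -> float:
    return max(0.0, v)

def backdoored_relu(v: float) -> float:
    """Behaves like ReLU unless the checksum of the input matches the key."""
    if csum(v) == SECRET_KEY:
        return BACKDOOR_OUTPUT
    return relu(v)

def looks_suspicious(f, v: float, eps: float = 1e-4, threshold: float = 1.0) -> bool:
    """Proximity check: flag v if f(v) differs sharply from f at nearby inputs."""
    return any(abs(f(v) - f(v + d)) > threshold for d in (-eps, eps))

benign_input = 0.25      # checksum 2+5+0+0 = 7, so no trigger
trigger_input = 0.7996   # checksum 7+9+9+6 = 31, so the backdoor fires

print(backdoored_relu(benign_input), looks_suspicious(backdoored_relu, benign_input))
print(backdoored_relu(trigger_input), looks_suspicious(backdoored_relu, trigger_input))
```

In this toy setting the backdoored operator matches ReLU on almost all inputs, which is what makes the trigger hard to find by random testing, while the proximity check exposes it because a smooth operator should not change sharply between neighboring inputs. A benign input that happens to lie next to a trigger value would also be flagged, which hints at the false-positive trade-off of proximity-based defenses.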
