Learning to Protect Communications with Adversarial Neural Cryptography
Martín Abadi, David G. Andersen
TL;DR
This work demonstrates that neural networks can learn to protect communications from an adversarial neural network through end-to-end adversarial training, without being given any specific cryptographic algorithm. The authors model three networks, Alice, Bob, and Eve, and optimize a joint objective that rewards Alice and Bob when Bob recovers the plaintext accurately and Eve, who eavesdrops on their communication, does not. Under this objective the networks discover forms of encryption and decryption and learn to apply them selectively to meet confidentiality goals. Experiments across varying key lengths show key-dependent enciphering behavior and also expose challenges in training stability and generalization, making the work a proof of concept for data-driven cryptographic protection. The results motivate further exploration of neural approaches to cryptography, privacy-preserving representations, and related robustness analyses.
Abstract
We ask whether neural networks can learn to use secret keys to protect information from other neural networks. Specifically, we focus on ensuring confidentiality properties in a multiagent system, and we specify those properties in terms of an adversary. Thus, a system may consist of neural networks named Alice and Bob, and we aim to limit what a third neural network named Eve learns from eavesdropping on the communication between Alice and Bob. We do not prescribe specific cryptographic algorithms to these neural networks; instead, we train end-to-end, adversarially. We demonstrate that the neural networks can learn how to perform forms of encryption and decryption, and also how to apply these operations selectively in order to meet confidentiality goals.
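The joint objective mentioned in the TL;DR can be made concrete with a small sketch. The text above does not state the loss formulas, so everything here is an assumption chosen for illustration: bits are encoded as -1.0 / +1.0, the error is a count of wrong bits derived from the L1 distance, and the function names `bits_wrong`, `eve_loss`, and `alice_bob_loss` are hypothetical. The sketch penalizes Alice and Bob both when Eve is right and when Eve is entirely wrong, since a consistently wrong Eve could recover the plaintext by flipping every bit; the target is an Eve that does no better than chance.

```python
from typing import Sequence


def bits_wrong(plaintext: Sequence[float], guess: Sequence[float]) -> float:
    """Count wrong bits, assuming bits are encoded as -1.0 / +1.0.

    A fully wrong bit contributes |1 - (-1)| = 2 to the L1 distance,
    so the distance is halved to obtain a bit count.
    """
    if len(plaintext) != len(guess):
        raise ValueError("plaintext and guess must have the same length")
    return sum(abs(p - g) for p, g in zip(plaintext, guess)) / 2.0


def eve_loss(plaintext: Sequence[float], eve_guess: Sequence[float]) -> float:
    """Eve simply tries to minimize her own reconstruction error."""
    return bits_wrong(plaintext, eve_guess)


def alice_bob_loss(
    plaintext: Sequence[float],
    bob_guess: Sequence[float],
    eve_guess: Sequence[float],
) -> float:
    """Assumed joint objective: Bob should be right, Eve should be at chance."""
    n = len(plaintext)
    half = n / 2.0
    # Bob's term: fraction of bits Bob gets wrong (0 when Bob is perfect).
    bob_term = bits_wrong(plaintext, bob_guess) / n
    # Eve's term: 0 when Eve gets exactly half the bits wrong (chance level),
    # 1 when Eve is fully right or fully wrong.
    eve_term = (half - bits_wrong(plaintext, eve_guess)) ** 2 / half ** 2
    return bob_term + eve_term


plaintext = [1.0, -1.0, 1.0, -1.0]
eve_at_chance = [1.0, -1.0, -1.0, 1.0]    # two of four bits wrong
eve_inverted = [-1.0, 1.0, -1.0, 1.0]     # every bit wrong

loss_chance = alice_bob_loss(plaintext, plaintext, eve_at_chance)
loss_leak = alice_bob_loss(plaintext, plaintext, plaintext)
loss_inverted = alice_bob_loss(plaintext, plaintext, eve_inverted)
```

In adversarial training, Eve's parameters would be updated to reduce `eve_loss` while Alice's and Bob's parameters are updated to reduce `alice_bob_loss`, typically in alternation. The sketch shows only the loss arithmetic, not the networks or the optimizer.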
