SparSamp: Efficient Provably Secure Steganography Based on Sparse Sampling
Yaofei Wang, Gang Pei, Kejiang Chen, Jinyang Ding, Chao Pan, Weilong Pang, Donghui Hu, Weiming Zhang
TL;DR
SparSamp is a provably secure steganography (PSS) method for generative models that embeds messages without distorting the model's probability distribution. It does so through message-driven sampling with message-derived random numbers (MRNs) and a sparse, index-based embedding scheme that adds only $O(1)$ complexity per sampling step, so generation speed is preserved. The approach reaches high embedding speeds across text, image, and audio models, keeps the KL divergence from the cover distribution low or zero, and remains compatible with token disambiguation techniques for handling token ambiguity (TA). Empirical results show that SparSamp outperforms prior PSS methods in speed and capacity while maintaining security, which suggests practical potential for real-time covert communication in diverse AIGC tasks.
Abstract
Steganography embeds confidential data within seemingly innocuous communications. Provable security in steganography, a long-sought goal, has become feasible with deep generative models. However, existing methods face a critical trade-off between security and efficiency. This paper introduces SparSamp, an efficient provably secure steganography method based on sparse sampling. SparSamp embeds messages by combining them with pseudo-random numbers to obtain message-derived random numbers for sampling. It enhances extraction accuracy and embedding capacity by increasing the sampling intervals and making the sampling process sparse. SparSamp preserves the original probability distribution of the generative model, thus ensuring security. It introduces only $O(1)$ additional complexity per sampling step, enabling the fastest embedding speed without compromising generation speed. SparSamp is designed to be plug-and-play; message embedding can be achieved by simply replacing the sampling component of an existing generative model with SparSamp. We implemented SparSamp in text, image, and audio generation models. It can achieve embedding speeds of up to 755 bits/second with GPT-2, 5,046 bits/second with DDPM, and 9,223 bits/second with WaveRNN.
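The idea of combining a message with a pseudo-random number before sampling can be illustrated with a toy, single-step sketch. This is not the paper's algorithm: the shift rule `(r + message / 2**n_bits) mod 1`, the function names, and the brute-force receiver are assumptions made purely for illustration, and the sparse multi-step procedure that narrows the candidates to a single message is omitted.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_index(probs, u):
    """Inverse-CDF sampling: return the index whose interval contains u in [0, 1)."""
    cdf = list(accumulate(probs))
    # Clamp guards against floating-point round-off in the last CDF entry.
    return min(bisect_right(cdf, u), len(probs) - 1)

def embed_step(probs, message, n_bits, r):
    """Hypothetical embedding step: shift the shared random number r by
    message / 2**n_bits (mod 1) and use the result to drive sampling."""
    r_m = (r + message / 2 ** n_bits) % 1.0
    return sample_index(probs, r_m)

def candidates(probs, token, n_bits, r):
    """Hypothetical receiver step: list every message that would have
    produced `token` under the same shared r (brute force, for clarity)."""
    return [m for m in range(2 ** n_bits)
            if embed_step(probs, m, n_bits, r) == token]

# Toy next-token distribution standing in for a generative model's output.
probs = [0.5, 0.25, 0.125, 0.125]
shared = random.Random(42)   # stands in for the key-seeded PRNG both sides share
r = shared.random()
token = embed_step(probs, message=5, n_bits=3, r=r)
print(token, candidates(probs, token, n_bits=3, r=r))
```

For any fixed message, the shifted number is still uniform on [0, 1) whenever `r` is, which is why a shift of this kind leaves the sampled token's distribution unchanged. A single step usually leaves several candidate messages; the true message is always among them.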
