Stable Messenger: Steganography for Message-Concealed Image Generation

Quang Nguyen; Truong Vu; Cuong Pham; Anh Tran; Khoi Nguyen

TL;DR

This work addresses the need for robust, practical evaluation of hidden-message steganography by introducing the message accuracy metric, which requires exact recovery of the entire embedded message. It proposes the Log-Sum-Exponential (LSE) loss to provide informative gradients focused on the most erroneous bits, improving full-message recovery, and a latent-aware encoding scheme that leverages a pretrained Stable Diffusion model to better align encoding with image content. The Stable Messenger framework supports both cover and generative modes, using a latent-aware encoder $E_m$ and a message decoder $D_m$ trained with image-reconstruction and message-reconstruction losses. Across MirFlickr, CLIC, and MetFaces, the approach achieves favorable image quality while significantly improving message accuracy, demonstrating robustness to several transformations and highlighting the practical utility for watermarking and ownership protection in real-world image generation systems.
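
To make the difference between per-bit evaluation and the message accuracy metric concrete, the following is a minimal sketch, assuming messages are equal-length lists of 0/1 bits. The function names `bit_accuracy` and `message_accuracy` are illustrative and do not come from the paper's code.

```python
from typing import Sequence


def bit_accuracy(sent: Sequence[Sequence[int]], decoded: Sequence[Sequence[int]]) -> float:
    """Fraction of individual bits recovered correctly, pooled over all messages."""
    total = sum(len(m) for m in sent)
    correct = sum(a == b for m, d in zip(sent, decoded) for a, b in zip(m, d))
    return correct / total


def message_accuracy(sent: Sequence[Sequence[int]], decoded: Sequence[Sequence[int]]) -> float:
    """Fraction of messages in which every single bit was recovered correctly."""
    exact = sum(list(m) == list(d) for m, d in zip(sent, decoded))
    return exact / len(sent)


# Two 8-bit messages; the second one is decoded with a single wrong bit.
sent = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 1]]
decoded = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 0]]

print(bit_accuracy(sent, decoded))      # 15 of 16 bits match
print(message_accuracy(sent, decoded))  # only 1 of 2 messages matches exactly
```

A single flipped bit barely moves the per-bit score but halves the message accuracy in this example, which is why the stricter metric is more informative when the full message must be recovered.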

Abstract

In the ever-expanding digital landscape, safeguarding sensitive information remains paramount. This paper focuses on steganography as a means of digital protection. Prior research has predominantly evaluated the decoding of individual bits; we address this limitation by introducing "message accuracy", a novel metric that evaluates the decoded message as a whole. We also propose an adaptive universal loss tailored to enhance message accuracy, named the Log-Sum-Exponential (LSE) loss, which significantly improves the message accuracy of recent approaches. Furthermore, we introduce a new latent-aware encoding technique in our framework, named Stable Messenger, which harnesses pretrained Stable Diffusion for steganographic image generation and yields a better trade-off between image quality and message recovery. Experimental results demonstrate the superior performance of the new LSE loss and the latent-aware encoding technique. Together, these contributions advance evaluation metrics, loss functions, and image concealment techniques toward more robust and dependable information protection.
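
The exact form of the LSE loss is not given in this summary. The sketch below therefore assumes one plausible form: a log-sum-exp over per-bit squared errors, with a hypothetical `temperature` parameter. It illustrates why such a loss concentrates its gradient on the most erroneous bits: the derivative of log-sum-exp with respect to each per-bit error is a softmax weight, whereas a mean over bits weights every bit equally.

```python
import math
from typing import List, Sequence


def per_bit_errors(target: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Squared error of each predicted bit (assumed error measure)."""
    return [(t - p) ** 2 for t, p in zip(target, predicted)]


def lse_loss(errors: Sequence[float], temperature: float = 1.0) -> float:
    """Numerically stable temperature * log(sum(exp(error / temperature)))."""
    scaled = [e / temperature for e in errors]
    peak = max(scaled)
    return temperature * (peak + math.log(sum(math.exp(s - peak) for s in scaled)))


def lse_gradient_weights(errors: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Derivative of lse_loss with respect to each error: softmax(errors / temperature)."""
    scaled = [e / temperature for e in errors]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    norm = sum(exps)
    return [x / norm for x in exps]


target = [1.0, 0.0, 1.0, 0.0]
predicted = [0.9, 0.1, 0.2, 0.1]  # the third bit is badly wrong
errors = per_bit_errors(target, predicted)

weights = lse_gradient_weights(errors, temperature=0.1)
uniform = [1.0 / len(errors)] * len(errors)  # weights implied by a plain mean

print(weights)  # most of the weight falls on the third bit
print(uniform)  # a mean spreads the weight evenly
```

Lowering the assumed temperature sharpens the focus on the worst bit, while raising it moves the weights back toward the uniform weighting of a mean.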

Paper Structure (12 sections, 4 equations, 6 figures, 6 tables)

This paper contains 12 sections, 4 equations, 6 figures, 6 tables.

Introduction
Related Work
Proposed Approach
Message Accuracy Metric
Log-Sum-Exponential (LSE) Loss
Stable Messenger
Experiments
Experimental Setup
Comparison with Prior Methods
Robustness Evaluation
Ablation Study
Discussion

Figures (6)

Figure 1: Reconstructed images and messages of our method and prior work. Although the image quality of all methods is similar, the message recovered in our approach has a much better message accuracy (our proposed metric) than those of prior work.
Figure 2: Training of Stable Messenger. Given a real image $I$, we utilize the Image Encoder $E$ of SD to extract its latent $z$. Subsequently, the Latent-aware Message Encoder $E_m$ takes as input the message $m$ and latent $z$ to produce the message encoding $e$. Next, the SD Image Decoder $D$ receives the modified latent $z'=z+e$ to generate the steganographic image $I'$. Optionally, $I'$ can be further transformed with operations such as blurring and compression to enhance robustness. Finally, the Message Decoder $D_m$ recovers the hidden message $m'$ from $I'$. To train the network, we use two sets of loss functions: image reconstruction and message reconstruction.
Figure 3: Testing of Stable Messenger. There are two modes: cover mode and generative mode. In the cover mode, the real image $I$ is encoded into latent $z$ with a pretrained SD Image Encoder, while in the generative mode, the pretrained UNet of SD transforms noise $\epsilon$ into latent $z$. Subsequent steps are similar to those in the training of Stable Messenger. See the caption of Fig. 2 for more details.
Figure 4: Histograms of wrong bits with and without using the LSE loss in StegaStamp (Tancik et al., 2020), RoSteALS (Bui et al., 2023), and ours.
Figure 5: Qualitative results in the cover mode. Each method introduces its own artifacts on the cover image. The second row shows the residual between the cover image and the steganographic image (the residual is magnified $2\times$ for visualization purposes only).
...and 1 more figure
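
The data flow described in the captions of Figures 2 and 3 ($z = E(I)$, $e = E_m(m, z)$, $z' = z + e$, $I' = D(z')$, $m' = D_m(I')$) can be sketched with toy stand-ins. This is not the paper's architecture: the learned networks are replaced by hand-written functions, with identity maps for the SD encoder and decoder and quantization index modulation for the message encoder and decoder. The constant `STEP` and all function names are hypothetical.

```python
from typing import List, Sequence

STEP = 0.5  # hypothetical quantization step used by the toy message encoder


def image_encoder(image: Sequence[float]) -> List[float]:
    """Stand-in for the SD Image Encoder E (identity here)."""
    return list(image)


def image_decoder(latent: Sequence[float]) -> List[float]:
    """Stand-in for the SD Image Decoder D (identity here)."""
    return list(latent)


def message_encoder(message: Sequence[int], latent: Sequence[float]) -> List[float]:
    """Stand-in for E_m: the encoding e depends on both the message and the latent.

    Each e_i moves z_i onto a multiple of STEP whose index parity equals the bit
    (not necessarily the closest such multiple).
    """
    encoding = []
    for bit, z in zip(message, latent):
        index = round(z / STEP)
        if index % 2 != bit:
            index += 1
        encoding.append(index * STEP - z)
    return encoding


def message_decoder(image: Sequence[float]) -> List[int]:
    """Stand-in for D_m: reads each bit back from the parity of the nearest index."""
    return [round(x / STEP) % 2 for x in image]


image = [0.12, -0.73, 1.48, 0.90]  # toy "image" of four values
message = [1, 0, 1, 1]

z = image_encoder(image)                      # z = E(I)
e = message_encoder(message, z)               # e = E_m(m, z)
z_prime = [a + b for a, b in zip(z, e)]       # z' = z + e
stego = image_decoder(z_prime)                # I' = D(z')
perturbed = [x + 0.1 for x in stego]          # optional transformation of I'
recovered = message_decoder(perturbed)        # m' = D_m(I')

print(recovered == message)
```

In the generative mode, only the first step would change: the latent $z$ would come from a generator driven by noise $\epsilon$ instead of from `image_encoder`.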
