Sanitizing Hidden Information with Diffusion Models

Preston K. Robinette; Daniel Moyer; Taylor T. Johnson

TL;DR

This paper responds to the growing misuse of digital steganography by proposing DM-SUDS, a blind sanitization method built on diffusion models that removes secrets embedded by image-into-image hiding while preserving image quality. On CIFAR-10 and ImageNet, including JPEG-resistant hiding, diffusion-based denoising outperforms the prior VAE-based approach (SUDS) and simple noise schemes. The paper also introduces a sanitization specification that jointly considers safety (secret removal) and utility (image fidelity). An ablation study and an audio-domain case study round out the evaluation, and the latter shows that pretrained diffusion models let DM-SUDS carry over to other domains.

Abstract

Information hiding is the process of embedding data within another form of data, often to conceal its existence or prevent unauthorized access. It underlies various forms of secure communication (steganography), which bad actors can use to propagate malware, exfiltrate victim data, and communicate discreetly. Recent work has used deep neural networks to remove this hidden information, a defense mechanism known as sanitization. Previous deep learning approaches, however, do not scale efficiently beyond the MNIST dataset. In this work, we present DM-SUDS, a novel sanitization method that uses a diffusion model framework to remove hidden information embedded by universal and dependent image-into-image steganography on the CIFAR-10 and ImageNet datasets. We evaluate DM-SUDS against three baselines using MSE, PSNR, SSIM, and NCC, and provide further analysis through an ablation study. DM-SUDS outperforms all three baselines and, compared to previous deep learning approaches, improves image-preservation MSE by 50.44%, PSNR by 12.69%, SSIM by 11.49%, and NCC by 3.26%. Additionally, we introduce a novel evaluation specification that considers both the successful removal of hidden information (safety) and the quality of the sanitized image (utility). Finally, an audio case study demonstrates the method's applicability to additional domains.
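As a rough illustration of the image-preservation metrics named in the abstract, the sketch below computes MSE, PSNR, and NCC on flat lists of pixel values. The function names are hypothetical, the zero-mean form of NCC is an assumption (the paper's exact definition may differ), and SSIM is left out because it needs windowed local statistics that do not fit a short example.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / err)


def ncc(a, b):
    """Zero-mean normalized cross-correlation (assumed definition), in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A constant image has zero variance; report 0.0 rather than divide by zero.
    return num / den if den else 0.0


# Example: compare a cover against a lightly perturbed copy.
cover = [0.0, 64.0, 128.0, 255.0]
sanitized = [1.0, 65.0, 129.0, 254.0]
print(mse(cover, sanitized), psnr(cover, sanitized), ncc(cover, sanitized))
```

For image preservation, lower MSE and higher PSNR and NCC between the cover and the sanitized image are better. For secret elimination, the same measures would be applied to the secret and whatever can be revealed after sanitization, where low similarity is the goal.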

Paper Structure (14 sections, 4 equations, 6 figures, 2 tables)

Preliminaries
Steganography
Sanitization
Sanitization Metrics
DM-SUDS Sanitization
Research Questions and Metrics
Experimental Results
Sanitization Performance of DM-SUDS
Effect of Timestep on Sanitization
Ability to Directly Denoise
Scalability to ImageNet and Robustness (JPEG)
Evaluation
Case Study: Application to Audio Steganography
Conclusion

Figures (6)

Figure 1: DM-SUDS (center) takes as input a cover or container image $x_0\in\{C, C'\}$, where the container may be created with any type of steganographic technique. A noisy image is then sampled from $x_0$ at timestep $t$. In the reverse diffusion process, a denoising U-Net predicts the noise added to the image, and this prediction is used to recover the original image, yielding a sanitized image $\hat{C}$. Before sanitization, the secret is recoverable, as shown by $S'$ in the bottom-left of the figure. After sanitization with DM-SUDS, the secret is no longer recoverable, as shown by $\hat{S}$ in the bottom-right of the figure. The secret, therefore, is successfully eliminated.
Figure 2: Information Hiding Diagram.
Figure 3: A comparison between DM-SUDS and SUDS sanitization for a) LSB, b) DDH, and c) UDH steganography. Sanitization performance is determined from image preservation as well as secret elimination. While the secrets are eliminated with each method, DM-SUDS is the only method able to preserve the image.
Figure 4: Example of sanitized images $\hat{C}$ and attempted revealed secrets after sanitization $\hat{S}$ on DDH containers across various timesteps.
Figure 5: Image preservation (IP) and secret elimination (SE) NCC metrics from sanitizing DDH, UDH, and LSB containers on various timesteps using added noise (a) and no added noise (b).
...and 1 more figure
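The pipeline described in the Figure 1 caption (sample a noisy image at timestep $t$, predict the added noise, recover a clean estimate) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the standard DDPM forward process with a linear beta schedule (1e-4 to 0.02 over 1000 steps), pixels scaled to [-1, 1], and a single-step estimate of $x_0$. All function names are hypothetical, and `noise_predictor` stands in for a trained denoising U-Net.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 0..t (assumed linear schedule)."""
    prod = 1.0
    for s in range(t + 1):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def add_noise(x0, t, rng):
    """Forward process: x_t = sqrt(ab) * x_0 + sqrt(1 - ab) * eps, eps ~ N(0, 1)."""
    ab = alpha_bar(t)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(ab) * x + math.sqrt(1.0 - ab) * e for x, e in zip(x0, eps)]
    return xt, eps


def predict_x0(xt, eps_hat, t):
    """Invert the forward process using a noise estimate eps_hat."""
    ab = alpha_bar(t)
    return [(x - math.sqrt(1.0 - ab) * e) / math.sqrt(ab) for x, e in zip(xt, eps_hat)]


def sanitize(x0, t, noise_predictor, rng):
    """Noise the input at timestep t, then denoise with the supplied predictor."""
    xt, _ = add_noise(x0, t, rng)
    eps_hat = noise_predictor(xt, t)  # hypothetical stand-in for a trained U-Net
    return predict_x0(xt, eps_hat, t)


# Algebra check only: feeding back the true noise reproduces the input exactly.
rng = random.Random(0)
x0 = [-1.0, -0.5, 0.0, 0.5, 1.0]
xt, eps = add_noise(x0, 250, rng)
print(predict_x0(xt, eps, 250))
```

Feeding back the true noise only verifies that the two formulas are inverses; it would also return any hidden payload untouched. Sanitization relies on a trained predictor whose estimate reflects natural image statistics rather than the exact input, and the timestep $t$ controls how much of the input is overwritten by noise before denoising.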

Theorems & Definitions (1)

Definition 1: Sanitization
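The two goals that the abstract attaches to sanitization, safety (the secret is removed) and utility (the image is preserved), can be seen in a toy setting with LSB hiding, one of the steganography types compared in Figure 3. The sketch below is illustrative only: it uses 1-bit secrets, hypothetical function names, and LSB randomization as the sanitizer, which is a simple noise scheme and not DM-SUDS.

```python
import random


def lsb_hide(cover, secret_bits):
    """Embed one secret bit into the least significant bit of each 8-bit pixel."""
    return [(p & ~1) | b for p, b in zip(cover, secret_bits)]


def lsb_reveal(container):
    """Read the least significant bit of every pixel."""
    return [p & 1 for p in container]


def randomize_lsb(container, rng):
    """Toy sanitizer: overwrite every least significant bit with a random bit."""
    return [(p & ~1) | rng.getrandbits(1) for p in container]


rng = random.Random(0)
cover = [rng.randrange(256) for _ in range(64)]
secret = [rng.getrandbits(1) for _ in range(64)]

container = lsb_hide(cover, secret)
sanitized = randomize_lsb(container, rng)

# Utility: each pixel moves by at most 1 intensity level from the cover.
print(max(abs(c - s) for c, s in zip(cover, sanitized)))
# Safety: the fraction of secret bits still readable should sit near chance.
matches = sum(a == b for a, b in zip(secret, lsb_reveal(sanitized)))
print(matches / len(secret))
```

This trick works only because the hiding location is known. A blind sanitizer has no such knowledge and must remove secrets embedded by arbitrary techniques, including learned ones such as DDH and UDH, which is the setting the paper targets.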
