Data Hiding with Deep Learning: A Survey Unifying Digital Watermarking and Steganography

Zihan Wang, Olivia Byrnes, Hu Wang, Ruoxi Sun, Congbo Ma, Huaming Chen, Qi Wu, Minhui Xue

TL;DR

The paper tackles secure communication and intellectual property protection by surveying deep learning approaches to data hiding, unifying digital watermarking and steganography. It systematically analyzes encoder–decoder and GAN-based architectures, various noise-injection strategies, objective losses, evaluation metrics, and datasets, with emphasis on the trade-offs among capacity $R$, imperceptibility $I$, and robustness $A$. Key contributions include a comprehensive taxonomy of methods, performance comparisons (noting limitations of cross-study comparability), and a discussion of open questions such as watermarking for ML models, backdoor risks, and applications to synthetic media detection. The work highlights the practical impact of DL-based data hiding for trustworthy AI, secure media authentication, and defense against misuse across diverse media domains.

Abstract

Secure communication and identity verification have advanced significantly through the use of deep learning techniques for data hiding. By embedding information into a noise-tolerant signal such as audio, video, or images, digital watermarking and steganography techniques can protect sensitive intellectual property and enable confidential communication, ensuring that the embedded information is accessible only to authorized parties. This survey provides an overview of recent developments in deep learning techniques deployed for data hiding, categorized systematically according to model architectures and noise injection methods. The objective functions, evaluation metrics, and datasets used for training these data hiding models are comprehensively summarized. Additionally, potential future research directions that unite digital watermarking and steganography in software engineering to enhance security and mitigate risks are suggested and discussed. This contribution furthers the creation of a more trustworthy digital world and advances Responsible AI.

Paper Structure

This paper contains 52 sections, 19 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: A hierarchical diagram showing different methods for classifying deep learning-based data hiding techniques. Blindness refers to the functionality of the data hiding method, further explained in Section 2.
  • Figure 2: A figure showing the trade-off between the three primary data hiding properties (robustness, imperceptibility, and capacity), as well as which data hiding applications favour each property over the others.
  • Figure 3: A hierarchical diagram showing the classification of deep learning-based data hiding models presented in this survey. 'Adversarial training' refers to attack simulation during training, which includes noise-based attacks generated by a trained CNN.
  • Figure 4: A diagram showing a general encoder-decoder architecture for digital watermarking.