Natias: Neuron Attribution based Transferable Image Adversarial Steganography

Zexin Fan, Kejiang Chen, Kai Zeng, Jiansong Zhang, Weiming Zhang, Nenghai Yu

TL;DR

Natias is a novel adversarial steganographic scheme that attributes the output of a steganalytic model to each neuron in a target middle layer, identifies the critical features that diverse steganalytic models may rely on, and corrupts those features to promote the transferability of adversarial steganography.

Abstract

Image steganography is a technique to conceal secret messages within digital images. Steganalysis, by contrast, aims to detect the presence of secret messages within images. Recently, deep-learning-based steganalysis methods have achieved excellent detection performance. As a countermeasure, adversarial steganography has garnered considerable attention due to its ability to effectively deceive deep-learning-based steganalysis. However, steganalysts often employ unknown steganalytic models for detection. Therefore, the ability of adversarial steganography to deceive non-target steganalytic models, known as transferability, becomes especially important. Nevertheless, existing adversarial steganographic methods do not consider how to enhance transferability. To address this issue, we propose a novel adversarial steganographic scheme named Natias. Specifically, we first attribute the output of a steganalytic model to each neuron in the target middle layer to identify critical features. Next, we corrupt these critical features, which may be adopted by diverse steganalytic models. Corrupting them promotes the transferability of adversarial steganography. Our proposed method can be seamlessly integrated with existing adversarial steganography frameworks. Thorough experiments show that the proposed method transfers better than previous approaches and achieves higher security in retraining scenarios.
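
The two steps named in the abstract (attribute the model output to middle-layer neurons, then target the critical ones) can be illustrated with a toy sketch. The abstract does not give the attribution rule, so the sketch assumes integrated gradients along a straight path from an all-zero baseline, applied directly in feature space. The head `head`, the weights `w`, the activations `y`, and the loss `feature_loss` are all hypothetical stand-ins, not the paper's model or objective.

```python
import math

def head(y, w):
    """Toy steganalyzer head (assumption): middle-layer features -> P(stego)."""
    z = sum(wj * yj for wj, yj in zip(w, y))
    return 1.0 / (1.0 + math.exp(-z))

def head_grad(y, w):
    """Gradient of head() with respect to each middle-layer neuron."""
    p = head(y, w)
    return [p * (1.0 - p) * wj for wj in w]

def neuron_attribution(y, baseline, w, steps=64):
    """Integrated-gradients attribution of head() to each neuron of y."""
    avg_grad = [0.0] * len(y)
    for m in range(steps):
        t = (m + 0.5) / steps  # midpoint rule along the straight path
        point = [b + t * (yj - b) for yj, b in zip(y, baseline)]
        for j, g in enumerate(head_grad(point, w)):
            avg_grad[j] += g / steps
    return [(yj - b) * g for yj, b, g in zip(y, baseline, avg_grad)]

def feature_loss(y, attribution):
    """Hypothetical objective: attribution-weighted sum of the features."""
    return sum(a * yj for a, yj in zip(attribution, y))

w = [2.0, -1.0, 0.5, 0.0]        # hypothetical head weights
y = [1.0, 0.5, 2.0, 3.0]         # hypothetical middle-layer activations
baseline = [0.0, 0.0, 0.0, 0.0]  # all-zero reference features

A = neuron_attribution(y, baseline, w)
# Rank neurons by attribution magnitude; the top ones are the "critical" features.
critical = sorted(range(len(A)), key=lambda j: abs(A[j]), reverse=True)

# Moving the features against the attribution lowers the weighted loss.
y_corrupted = [yj - 0.5 * a for yj, a in zip(y, A)]
```

In this toy setting the attributions sum (up to numerical integration error) to the change in the head's output between the baseline and `y`, which is what makes them usable as per-neuron importance scores. In an actual adversarial steganography pipeline the features would be changed indirectly, through the embedding, rather than edited directly as `y_corrupted` is here.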

Paper Structure (19 sections, 21 equations, 5 figures, 9 tables, 1 algorithm)

This paper contains 19 sections, 21 equations, 5 figures, 9 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of workflows for existing logits-level adversarial steganographic methods and our proposed feature-level adversarial steganographic method.
  • Figure 2: Visualization of the attention distributions of three steganalytic models (SRNet, CovNet, and LWENet). Redder regions are more important to the steganalytic model's decision. The top row shows the attention distributions of the different steganalytic models when detecting the cover. The bottom row shows their attention distributions when detecting the corresponding stego, which is generated by using our proposed Natias to attack CovNet. The regions enclosed by the red rectangles mark notable alterations in the attention maps.
  • Figure 3: The framework of our proposed Natias method. "$A$" denotes the neuron attribution results, "$\otimes$" denotes the element-wise product, "$\oplus$" denotes the element-wise addition, "$\eta$" denotes the coefficient controlling the magnitude of the cover enhancement, "$\alpha$" denotes the coefficient controlling the magnitude of the distortion adjustment, and "$\bm{msg}$" denotes the secret message to be embedded.
  • Figure 4: Detection accuracy evaluated on retrained steganalytic models compared with ADV-EMB. The y-axis represents the detection accuracy of the steganalytic model, and the x-axis represents the payload. The basic distortion function of the top row is HILL, and the basic distortion function of the bottom row is SUNIWARD.
  • Figure 5: Average error rate $P_{\text{E}}$ of different steganalytic models when detecting Natias and ADV-EMB under different target-layer settings. The target steganalytic model is SRNet. "RetrainSRNet" represents the new classifier obtained by retraining on the stego images generated by attacking SRNet.