ADLM-stega: A Universal Adaptive Token Selection Algorithm for Improving Steganographic Text Quality via Information Entropy

Zezheng Qin, Congcong Sun, Taiyi He, Yuke He, Azizol Abdullah, Normalia Samian, Nuur Alifah Roslan

TL;DR

Experimental results show that appropriately controlling the candidate pool size and the information entropy thresholds significantly improves the quality and detection resistance of steganographic texts, suggesting broad applicability in natural language processing.

Abstract

With information now shared widely across the globe, information security and privacy protection have become central concerns. Steganographic systems enhance information security by embedding confidential information into public carriers. However, existing generative text steganography methods struggle with the long-tail distribution of candidate word pools, which degrades the imperceptibility of the steganographic text. This paper proposes a quality control theory for steganographic text generation based on information entropy constraints and explores the relationship between the imperceptibility of steganographic texts and information entropy. By keeping the information entropy of the candidate word pool within a specific range, we optimize the imperceptibility of the steganographic text. We establish upper and lower bounds for information entropy and introduce an adaptive truncation method that balances semantic coherence and lexical diversity. Experimental results show that appropriately controlling the candidate pool size and the information entropy thresholds significantly improves the quality and detection resistance of steganographic texts, suggesting broad applicability in natural language processing.
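As a rough illustration of the idea in the abstract, the sketch below grows a candidate pool from the most probable tokens and stops before the pool's entropy exceeds an upper bound, then reports whether the pool also clears a lower bound. This is only one possible reading, not the paper's algorithm: the function names, the parameters `h_low`, `h_high`, and `max_size`, and the stopping rule are all assumptions made for this example, since the abstract does not state the actual bounds or procedure.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of an already normalized distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renormalize(probs):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(probs)
    return [p / total for p in probs]


def adaptive_truncate(probs, h_low, h_high, max_size):
    """Hypothetical entropy-bounded truncation of a next-token distribution.

    probs    -- model probabilities, one per token index
    h_low    -- assumed lower entropy bound (bits)
    h_high   -- assumed upper entropy bound (bits)
    max_size -- assumed cap on the candidate pool size

    Returns (pool, pool_entropy, within_bounds), where pool lists token
    indices in order of decreasing probability.
    """
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    pool = []
    pool_entropy = 0.0
    for idx in ranked[:max_size]:
        trial = pool + [idx]
        h = entropy(renormalize([probs[i] for i in trial]))
        # Stop before the long tail pushes entropy above the upper bound.
        if pool and h > h_high:
            break
        pool, pool_entropy = trial, h

    # A pool below the lower bound offers too little diversity; a caller
    # might choose not to embed secret bits at this step.
    within_bounds = h_low <= pool_entropy <= h_high
    return pool, pool_entropy, within_bounds


pool, h, ok = adaptive_truncate([0.5, 0.25, 0.125, 0.125], 0.5, 1.0, 4)
print(pool, round(h, 3), ok)
```

In this sketch a sharply peaked distribution yields a pool whose entropy stays below `h_low`, so `within_bounds` is false, while a flatter distribution is cut off once adding further tail tokens would exceed `h_high`.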

Paper Structure

This paper contains 22 sections, 18 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: The long-tail problem of the generative model. The underlined part is the image prediction section. Figure 1 (a) Human text prefix "Kamala Harris was born in 1964 in Oakland, California. Her mother was a cancer research scientist from India, and her father was an economics professor from Jamaica. Growing up in a multicultural environment, Harris faced the _ as a mixed-race individual." Figure 1 (b) Human text prefix "Kamala Harris was born in 1964 in Oakland, _."
  • Figure 2: Framework of the ADLM-stega approach.
  • Figure 3: Candidate set truncation process.
  • Figure 4: Relationship between the size of the candidate pool and the threshold.
  • Figure 5: Results of the Ablation Experiment.
  • ...and 1 more figure