Zero-shot Generative Linguistic Steganography

Ke Lin; Yiyang Luo; Zijian Zhang; Ping Luo

TL;DR

This paper proposes a zero-shot approach to linguistic steganography, based on in-context learning, that improves both perceptual and statistical imperceptibility. It also designs several new metrics and reproducible language evaluations to measure the imperceptibility of the stegotext.

Abstract

Generative linguistic steganography attempts to hide secret messages in covertext. Previous studies have generally focused on the statistical differences between the covertext and the stegotext; however, ill-formed stegotext can readily be identified by humans. In this paper, we propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility. We also design several new metrics and reproducible language evaluations to measure the imperceptibility of the stegotext. Our experimental results indicate that our method produces $1.926\times$ more innocent and intelligible stegotext than any other method.

Paper Structure (44 sections, 7 equations, 9 figures, 7 tables, 4 algorithms)

Introduction
Our Contributions.
Background
Generative Linguistic Steganography
Statistical Imperceptibility
Methodology
Codec
Variable-length Coding.
Edge Flipping Coding.
Embedding
Hide & Extract.
Annealing Selection.
Repeat Penalty.
In-Context Stegotext Generation
Context Selection.
...and 29 more sections

Figures (9)

Figure 1: Generative linguistic steganography pipeline.
Figure 2: Example of EF coding. The red and green lines mark positions where the bitstream changes (from 0 to 1, or from 1 to 0); these positions are assigned 1 in the next iteration. The blue and yellow lines mark positions where the bitstream does not change (from 0 to 0, or from 1 to 1); these positions are assigned 0 in the next iteration. The bitstream is processed for several rounds until it contains the fewest 1s.
Figure 3: A running example of the embedding module and in-context stegotext generation. The selected context of samples from the covertext, shown in gray, is used to instruct the stegotext generation. Huffman trees are constructed from the conditional distribution, and the word whose code matches the prefix of the bitstream is selected as the next token. For example, the red token has 5 candidate words after probability pruning. A Huffman tree is constructed for the 5 candidates, and the word "off" matches the prefix of the bitstream and is chosen as the next word.
Figure 4: Examples of movie-review stegotext generated by various methods at a similar $\text{BPW}\approx2.5$ (except ADG and SAAC). Compared with the baseline methods, our method generates more reasonable sentences.
Figure 5: Curves of BPW and JSDfull with respect to (a) the threshold $\tau$ and (b) the context size $k$.
...and 4 more figures
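
The captions of Figures 2 and 3 describe two mechanical steps that a small example may make easier to follow: one round of EF coding, and the choice of the next token by matching a Huffman codeword against the bitstream prefix. The Python sketch below is an illustration only, not the authors' implementation. Several details are assumptions that the captions do not state: the first bit is compared against an implicit leading 0, the number of rounds is capped by a hypothetical max_rounds parameter, and every candidate probability in the usage example is made up (only the word "off" appears in the Figure 3 caption). All function names are hypothetical.

```python
import heapq
from itertools import count


def ef_round(bits):
    """One EF round: emit 1 where a bit differs from its predecessor, else 0.

    Assumption: the first bit is compared against an implicit leading 0,
    which keeps the round invertible.
    """
    prev, out = 0, []
    for b in bits:
        out.append(b ^ prev)
        prev = b
    return out


def ef_inverse(bits):
    """Undo one EF round with a running XOR."""
    prev, out = 0, []
    for b in bits:
        prev ^= b
        out.append(prev)
    return out


def ef_encode(bits, max_rounds=8):
    """Return (coded_bits, rounds) for the round count with the fewest 1s.

    max_rounds is a hypothetical cap; the caption only says "several rounds".
    """
    best, best_rounds, cur = list(bits), 0, list(bits)
    for r in range(1, max_rounds + 1):
        cur = ef_round(cur)
        if sum(cur) < sum(best):
            best, best_rounds = cur, r
    return best, best_rounds


def huffman_codes(probs):
    """Map each candidate token to a prefix-free bit string."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {tok: ""}) for tok, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {t: "0" + c for t, c in left.items()}
        merged.update({t: "1" + c for t, c in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def embed_step(probs, bitstream):
    """Sender: pick the token whose codeword is a prefix of the bitstream."""
    for tok, code in huffman_codes(probs).items():
        if bitstream.startswith(code):
            return tok, bitstream[len(code):]
    raise ValueError("bitstream is shorter than every codeword; pad it first")


def extract_step(probs, token):
    """Receiver: rebuild the same tree and read back the token's codeword."""
    return huffman_codes(probs)[token]


if __name__ == "__main__":
    # Made-up distribution over 5 pruned candidates (illustration only).
    candidates = {"off": 0.35, "out": 0.25, "up": 0.20, "down": 0.12, "away": 0.08}
    token, rest = embed_step(candidates, "1011010")
    print(token, extract_step(candidates, token), rest)
```

Because a Huffman code is prefix-free and its tree is full, exactly one candidate matches any sufficiently long bitstream, so a receiver that rebuilds the same distribution recovers the hidden bits from the chosen token alone. In this sketch the receiver of an EF-coded bitstream would also need the round count to invert the coding; how the paper conveys that information is not stated in the captions.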
