Towards Next-Generation Steganalysis: LLMs Unleash the Power of Detecting Steganography

Minhao Bai, Jinshuai Yang, Kaiyi Pang, Huili Wang, Yongfeng Huang

TL;DR

This work reframes linguistic steganalysis as a generative task using instruction-tuned large language models (LLMs) rather than traditional classifiers. By fine-tuning Bloomz-7B1 and Llama-7B with LoRA while keeping the base models fixed, the approach activates human-like text perception to detect steganographic content across diverse datasets and embedding schemes. The results show strong domain-specific and domain-agnostic performance, with higher accuracy and F1 scores than baselines and robust transfer to unseen data. The study also analyzes prompt design effects and demonstrates the feasibility of a general, practical steganalysis model trained on diverse data, achieving near state-of-the-art detection with relatively modest training resources.

Abstract

Linguistic steganography offers a convenient way to hide messages, particularly with the emergence of AI generation technology. The potential abuse of this technology raises security concerns, calling for powerful linguistic steganalysis to detect carriers containing steganographic messages. Existing methods are limited to finding distribution differences between steganographic texts and normal texts in terms of symbolic statistics. However, these distribution differences are hard to model precisely, which severely hurts the detection ability of existing methods in realistic scenarios. To find a feasible way to build practical steganalysis for the real world, this paper proposes employing the human-like text processing abilities of large language models (LLMs) to capture the difference from the perspective of human perception, in addition to the traditional statistical perspective. Specifically, we systematically investigate the performance of LLMs on this task by modeling it in a generative paradigm instead of the traditional classification paradigm. Extensive experimental results reveal that generative LLMs exhibit significant advantages in linguistic steganalysis and demonstrate performance trends distinct from traditional approaches. The results also reveal that LLMs outperform existing baselines by a wide margin, and that the domain-agnostic ability of LLMs makes it possible to train a generic steganalysis model. Code and trained models are openly available at https://github.com/ba0z1/Linguistic-Steganalysis-with-LLMs.
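
To make the contrast between the generative and classification paradigms concrete, the sketch below shows one way a sentence and its label could be turned into an instruction-style training pair, and how a generated answer could be mapped back to a decision. The template wording, the label words ("steganographic", "normal"), and all function names are assumptions made for illustration; they are not taken from the paper or its repository.

```python
from typing import Optional

# Hypothetical prompt template; the wording actually used by the authors is not
# given in this section.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Decide whether the following text contains a hidden steganographic message. "
    "Answer with 'steganographic' or 'normal'.\n"
    "### Text:\n{text}\n"
    "### Answer:\n"
)


def build_prompt(text: str) -> str:
    """Fill the (assumed) template with the sentence to be examined."""
    return PROMPT_TEMPLATE.format(text=text.strip())


def build_training_example(text: str, is_stego: bool) -> dict:
    """Pair a prompt with the label word the model should learn to generate."""
    label = "steganographic" if is_stego else "normal"
    return {"prompt": build_prompt(text), "completion": label}


def parse_answer(generated: str) -> Optional[bool]:
    """Map generated text to a decision; None if neither label word leads the answer."""
    answer = generated.strip().lower()
    if answer.startswith("steganographic"):
        return True
    if answer.startswith("normal"):
        return False
    return None
```

In this framing the detector's output is ordinary generated text, so the decision comes from reading the answer rather than from a dedicated classification head; an unparseable answer is reported as `None` instead of being forced into a class.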

Paper Structure (17 sections, 7 equations, 8 figures, 8 tables)

Figures (8)

  • Figure 1: The detection accuracy of recent representative steganalysis methods [yang2019fast, TS-CSW, TS-RNN, zou2020high, niu2019hybrid, EILGF, yang2020linguistic]. Here we test two popular codebook construction methods, namely the source coding-based method AC [ziegler2019neural] and the distribution-preserving method ADG [zhang2021provably]. The red dotted line marks the 0.5 accuracy of random guessing.
  • Figure 3: The architecture for instruction fine-tuning in Bloomz/Llama + LoRA. Input sentences and labels are filled into a reasonable prompt template for LLM's fine-tuning. The training of parameters is limited to only the LoRA branch, while the parameters of LLMs remain constant.
  • Figure 4: Distributions of text, estimated by log probability of sentences and the number of tokens in sentences. Each figure consists of the main scatter plot of joint distribution and lines of marginal distributions at top and right. (a) represents the distributions of 3 natural datasets, and (b) (c) (d) denote the distributions of AC/HC/ADG stegos within these datasets, respectively. The second row illustrates the normalized distribution, estimated by normalized log probability of sentences and the number of tokens in sentences, corresponding to the first row.
  • Figure 5: Detection accuracy of Bloomz/Llama training with 3/10 epochs and baseline methods training with 10 epochs on Movie datasets.
  • Figure 6: Detection accuracy of Bloomz/Llama training with 1-10 epochs on Movie-AC dataset.
  • ...and 3 more figures
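
The Figure 3 caption states that only the LoRA branch is trained while the LLM parameters stay constant. A minimal sketch of that idea for a single linear layer follows, using the standard LoRA formulation (frozen weight W plus a scaled low-rank update B·A, with B initialized to zero). The class name, rank, scaling factor, and initialization scale are illustrative assumptions; the paper's actual hyperparameters and target modules are not given in this section.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Toy linear layer: frozen base weight plus a trainable low-rank branch."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # frozen base weight, never updated
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A (rank x in_dim) starts small and random; B (out_dim x rank) starts
        # at zero, so the layer initially matches the frozen base layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x):
        base = matvec(self.weight, x)                # W x
        update = matvec(self.B, matvec(self.A, x))   # B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameter_count(self):
        """Only A and B would receive gradient updates."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
output = layer.forward([1.0, 1.0])  # equals the base output while B is zero
```

Because B starts at zero, fine-tuning begins from the unchanged pretrained behavior, and the number of trained values grows with the rank rather than with the full weight matrix.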