Towards Next-Generation Steganalysis: LLMs Unleash the Power of Detecting Steganography
Minhao Bai, Jinshuai Yang, Kaiyi Pang, Huili Wang, Yongfeng Huang
TL;DR
This work reframes linguistic steganalysis as a generative task using instruction-tuned large language models (LLMs) rather than traditional classifiers. By fine-tuning Bloomz-7B1 and Llama-7B with LoRA while keeping the base models fixed, the approach activates human-like text perception to detect steganographic content across diverse datasets and embedding schemes. The results show strong domain-specific and domain-agnostic performance, with higher accuracy and F1 scores than baselines and robust transfer to unseen data. The study also analyzes prompt design effects and demonstrates the feasibility of a general, practical steganalysis model trained on diverse data, achieving near state-of-the-art detection with relatively modest training resources.
Abstract
Linguistic steganography offers a convenient way to hide messages, particularly with the emergence of AI generation technology. The potential abuse of this technology raises security concerns, calling for powerful linguistic steganalysis to detect carriers containing steganographic messages. Existing methods are limited to finding distribution differences between steganographic texts and normal texts from the perspective of symbolic statistics. However, these distribution differences are hard to model precisely, which severely hurts the detection ability of existing methods in realistic scenarios. To find a feasible way to build practical steganalysis for the real world, this paper proposes to employ the human-like text processing abilities of large language models (LLMs) to capture the difference from the perspective of human perception, in addition to the traditional statistical perspective. Specifically, we systematically investigate the performance of LLMs on this task by modeling it as a generative paradigm instead of the traditional classification paradigm. Extensive experimental results reveal that generative LLMs exhibit significant advantages in linguistic steganalysis and demonstrate performance trends distinct from traditional approaches. The results also show that LLMs outperform existing baselines by a wide margin, and that their domain-agnostic ability makes it possible to train a generic steganalysis model. Code and trained models are openly available at https://github.com/ba0z1/Linguistic-Steganalysis-with-LLMs.
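To make the "generative paradigm" concrete, the following is a minimal sketch of how a detection sample might be cast as an instruction-following example and how a generated answer might be mapped back to a label. The instruction wording, the answer words ("cover" and "stego"), and the helper names `build_example` and `parse_answer` are illustrative assumptions, not the prompts or code used in the paper.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical instruction and answer words; the paper's actual prompts may differ.
INSTRUCTION = "Decide whether the following text contains hidden steganographic content."
LABEL_WORDS = {0: "cover", 1: "stego"}


@dataclass
class Example:
    prompt: str   # text fed to the LLM
    target: str   # word the LLM is trained to generate


def build_example(text: str, label: int) -> Example:
    """Turn a (text, label) pair into a prompt/target pair for generative fine-tuning."""
    prompt = (
        f"### Instruction:\n{INSTRUCTION}\n\n"
        f"### Text:\n{text}\n\n"
        f"### Answer:\n"
    )
    return Example(prompt=prompt, target=LABEL_WORDS[label])


def parse_answer(generated: str) -> Optional[int]:
    """Map the first generated word back to a label; None if it is not a known answer word."""
    words = generated.strip().lower().split()
    first = words[0].strip(".,!?:;\"'") if words else ""
    for label, word in LABEL_WORDS.items():
        if first == word:
            return label
    return None
```

Under this framing, the model is trained to continue the prompt with the target word, rather than to feed a hidden state into a separate classification head. In a setup like the one described in the TL;DR, only the LoRA adapter weights would be updated on such pairs.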
