Relatively-Secure LLM-Based Steganography via Constrained Markov Decision Processes

Yu-Shin Huang; Chao Tian; Krishna Narayanan; Lizhong Zheng

TL;DR

This work reframes LLM-based steganography as a Constrained Markov Decision Process (CMDP) to account for the long-term effects of embedding secret bits in generated text. By distilling the LLM dynamics to a binary state space and applying a discounted total-variation constraint, the authors derive a convex reformulation whose solution is a deterministic policy that, in some cases, resembles water-filling. The main result shows that, depending on the LLM's transition structure, the optimal embedding policy either fixes one state's distribution, equalizes both states, or drives them toward uniformity, all while maintaining imperceptibility. This approach yields higher embedding efficiency without sacrificing naturalness, and it offers practical guidance on which token-generation states to adjust under a global constraint. The framework thus provides a principled, provably optimal method for secure, efficient LLM-based steganography, with potential applicability to resource-constrained settings where computational bounds on an adversary are relevant.

Abstract

Linguistic steganography aims to conceal information within natural language text without detection. An effective steganographic approach should encode the secret message into as few language tokens as possible while preserving the natural appearance and fluidity of the stego-texts. We present a new framework to enhance the embedding efficiency of stego-texts generated by modifying the output of a large language model (LLM). The novelty of our approach lies in abstracting the sequential steganographic embedding process as a Constrained Markov Decision Process (CMDP), which accounts for long-term dependencies rather than only the immediate effects. We constrain the solution space so that the discounted cumulative total variation divergence between the selected probability distribution and the original distribution given by the LLM stays below a threshold. To find the optimal policy, we first show that the functional optimization problem can be simplified to a convex optimization problem with a finite number of variables. We then present a closed-form solution for the optimal policy of this equivalent problem. Remarkably, the optimal policy is deterministic and, in some cases, resembles water-filling. The solution suggests that adjusting the probability distribution for the state with the least random transition probability should usually be prioritized, but that the choice should take into account the transition probabilities at all states rather than only the current state.
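To make the constraint in the abstract concrete, the following is a minimal sketch of how a discounted cumulative total variation cost could be computed along one generated trajectory and compared against a threshold. It is an illustration under stated assumptions, not the paper's formulation: the function names, the discount factor `gamma`, the threshold `budget`, and the toy two-outcome distributions are all hypothetical, and the paper's constraint may be defined in expectation over the policy rather than per trajectory.

```python
from typing import Sequence


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def discounted_tv(
    originals: Sequence[Sequence[float]],
    modified: Sequence[Sequence[float]],
    gamma: float,
) -> float:
    """Sum over steps t of gamma**t * TV(original_t, modified_t)."""
    return sum(
        (gamma ** t) * total_variation(p, q)
        for t, (p, q) in enumerate(zip(originals, modified))
    )


# Hypothetical two-step trajectory over a binary outcome (assumed values).
originals = [[0.9, 0.1], [0.6, 0.4]]  # distributions given by the LLM
modified = [[0.5, 0.5], [0.5, 0.5]]   # distributions selected for embedding
gamma = 0.5                           # assumed discount factor
budget = 0.5                          # assumed divergence threshold

cost = discounted_tv(originals, modified, gamma)
within_budget = cost <= budget
print(f"discounted TV cost = {cost:.3f}, within budget: {within_budget}")
```

In this toy case, pushing both steps to the uniform distribution costs more at the first step (where the original distribution is far from uniform) than at the second, which illustrates why a global budget forces a choice about which states to adjust.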
