Alkaid: Resilience to Edit Errors in Provably Secure Steganography via Distance-Constrained Encoding

Zhihan Cao, Gaolei Li, Jun Wu, Jianhua Li, Hang Zhang, Mingzhe Chen

TL;DR

Alkaid is a provably secure steganographic scheme that is resilient to edit errors. Its distance-constrained encoding integrates the minimum distance decoding principle directly into the encoding process by enforcing a strict lower bound on the edit distance between codewords of different messages.

Abstract

While provably secure steganography provides strong concealment by ensuring stego carriers are indistinguishable from natural samples, such systems remain vulnerable to real-world edit errors (e.g., insertions, deletions, substitutions) because their decoding depends on perfect synchronization and lacks error-correcting capability. To bridge this gap, we propose Alkaid, a provably secure steganographic scheme resilient to edit errors via distance-constrained encoding. The key innovation integrates the minimum distance decoding principle directly into the encoding process by enforcing a strict lower bound on the edit distance between codewords of different messages. Specifically, if two candidate codewords violate this bound, they are merged to represent the same message, thereby guaranteeing reliable recovery. We theoretically prove that Alkaid, while maintaining provable security, offers deterministic robustness against bounded errors. To implement this scheme efficiently, we adopt block-wise and batch processing. Extensive experiments demonstrate that Alkaid achieves decoding success rates of 99% to 100% across diverse error channels, delivers a payload of 0.2 bits per token for high embedding capacity, and maintains an encoding speed of 6.72 bits per second, significantly surpassing state-of-the-art (SOTA) methods in robustness, capacity, and efficiency.
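
The merging rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`edit_distance`, `group_codewords`, `decode`), the toy token sequences, and the threshold value are all assumptions made for illustration, and the sketch merges codewords transitively with a union-find structure, which may differ from the paper's grouping procedure. Under this assumption, any two codewords in different groups are at least `d_threshold` edits apart, so a received sequence with fewer than `d_threshold / 2` edits is still closest to a codeword of the group that was sent.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def group_codewords(codewords, d_threshold):
    """Merge codewords closer than d_threshold; return one group id per codeword."""
    parent = list(range(len(codewords)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(codewords)), 2):
        if edit_distance(codewords[i], codewords[j]) < d_threshold:
            parent[find(i)] = find(j)

    roots = sorted({find(i) for i in range(len(codewords))})
    label = {root: g for g, root in enumerate(roots)}
    return [label[find(i)] for i in range(len(codewords))]


def decode(received, codewords, groups):
    """Minimum distance decoding: group id of the nearest codeword."""
    nearest = min(range(len(codewords)),
                  key=lambda i: edit_distance(received, codewords[i]))
    return groups[nearest]


# Hypothetical toy codebook of token sequences (illustration only).
codewords = [
    "the cat sat down".split(),
    "the cat sat up".split(),
    "a dog ran far away".split(),
    "birds fly south in winter".split(),
]
groups = group_codewords(codewords, d_threshold=3)

# One token of the third codeword is deleted in transit.
received = "a dog far away".split()
recovered_group = decode(received, codewords, groups)
```

In this toy run the first two codewords differ by a single substitution, so they fall into one group and would carry the same message, while the corrupted third codeword still decodes to its own group.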

Paper Structure

This paper contains 27 sections, 3 theorems, 27 equations, 8 figures, 7 tables, 6 algorithms.

Key Result

Theorem 1

Let $\Pi$ be a steganographic scheme in which the encoder $\mathsf{Enc}$ uses distance-constrained encoding, and its encoding parameter $\xi = (\zeta_1, \dots, \zeta_k, \eta)$ consists of mutually independent elements sampled randomly from their respective spaces, and all message spaces $\mathcal{M}

Figures (8)

  • Figure 1: The process of distance-constrained encoding involves four main steps: (i) Codebook Construction, where a generative model $\mathcal{G}_\theta$ uses the encoding parameters $\xi$ to produce $k$ sequences as codewords; (ii) Distance-Constrained Grouping, in which codewords with edit distances below a given threshold $d_{\mathcal{T}}$ are grouped together; (iii) Adaptive Message Encoding, which dynamically allocates codes to represent specific messages within each group; (iv) Sequence Selection, where the specific secret message and the encoding parameters $\xi$ together determine the unique sequence that serves as the stego carrier.
  • Figure 2: An example of adaptive message encoding: Given a group distribution $[3/4, 1/8, 1/16, 1/16]$, we construct a binary tree of depth 5. The last layer contains 16 nodes, each representing a codeword. The encoding for each group is the common prefix of its codewords, resulting in the final codes: $[\varnothing, 110, 1110, 1111]$.
  • Figure 3: Overview of Alkaid. This scheme requires both the sender and receiver to use the same generative model, historical context, and secret key. On the sender side, the secret message is divided into blocks, each of which undergoes distance-constrained encoding until the entire message is embedded into the stegotext. On the receiver side, the same distance-constrained encoding process is repeated to reconstruct the codebook. Each message block is then recovered using minimum distance decoding, and finally reassembled to obtain the original secret message.
  • Figure 4: Decoding success rate comparison at different error rates on the edit error channel $\mathcal{E}_e$.
  • Figure 5: Decoding success rate comparison across different token-level errors.
  • ...and 3 more figures
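
The example in the Figure 2 caption can be reproduced with a short Python sketch. It assumes that each group occupies a run of consecutive leaves, in the listed order, of a complete binary tree with 16 leaves, and that a group's code is the longest common prefix of its leaf labels. The helper names `common_prefix` and `group_codes` are hypothetical, and the paper's actual allocation procedure may differ.

```python
from fractions import Fraction


def common_prefix(strings):
    """Longest common prefix of a non-empty list of strings."""
    prefix = strings[0]
    for s in strings[1:]:
        while not s.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


def group_codes(distribution, num_codewords):
    """Assign each group the common prefix of its consecutive leaf labels.

    Assumes num_codewords is a power of two (at least 2) and that every
    probability is a positive multiple of 1 / num_codewords.
    """
    bits = num_codewords.bit_length() - 1
    if num_codewords < 2 or (1 << bits) != num_codewords:
        raise ValueError("num_codewords must be a power of two, at least 2")

    codes, start = [], 0
    for p in distribution:
        size = Fraction(p) * num_codewords
        if size.denominator != 1 or size < 1:
            raise ValueError("each group needs a whole number of leaves")
        size = int(size)
        # Leaf labels are fixed-width binary strings, e.g. '1100' for leaf 12.
        leaves = [format(i, f"0{bits}b") for i in range(start, start + size)]
        codes.append(common_prefix(leaves))
        start += size

    if start != num_codewords:
        raise ValueError("group sizes must cover all leaves")
    return codes


distribution = [Fraction(3, 4), Fraction(1, 8), Fraction(1, 16), Fraction(1, 16)]
codes = group_codes(distribution, num_codewords=16)
print(codes)  # ['', '110', '1110', '1111']
```

The first group spans leaves `0000` through `1011`, which share no leading bit, so its code is empty and it embeds no message bits; the smaller groups receive the 3-bit and 4-bit codes listed in the caption.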

Theorems & Definitions (12)

  • Definition 1: Generative Model
  • Definition 2: Edit Error Channel
  • Definition 3: Pseudorandom Generator
  • Definition 4: Generative Steganographic Scheme
  • Definition 5: Information-Theoretic Security
  • Definition 6: Computational Security
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • proof
  • ...and 2 more