Training-Free Watermarking for Autoregressive Image Generation
Yu Tong, Zihao Pan, Shuai Yang, Kaiyang Zhou
TL;DR
IndexMark is a training-free watermarking approach for autoregressive image generation. It exploits codebook redundancy through red–green index pairs and a match-then-replace embedding strategy. The pairs come from a maximum weight perfect matching over edge weights $w(i,j)$, $M^* = \arg\max_M \sum_{(i,j) \in M} w(i,j)$, solved on a pruned graph with the Blossom algorithm. The method balances watermark strength and image quality through random red/green assignment and confidence-guided index replacement, where $\text{relative-conf}_k = \log\left(P(\mathrm{Idx}_k) / P(\mathrm{Idx}_k')\right)$. Watermark presence is verified via the green-index rate using a Central Limit Theorem-based confidence interval; an Index Encoder improves the precision of index reconstruction, and a cropping-robust verification protocol handles cropped images. Empirically, IndexMark achieves state-of-the-art image fidelity and verification accuracy across text- and class-conditioned autoregressive generation at multiple resolutions, and it is robust to blur, noise, JPEG compression, color jitter, erasing, and cropping. The work provides a practical, scalable mechanism for image tracing and copyright protection in autoregressive, codebook-based generation, without requiring model fine-tuning.
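The match-then-replace idea summarized above can be illustrated with a small Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: cosine similarity stands in for the edge weight w(i,j), an exact bitmask dynamic program replaces the Blossom algorithm (feasible only for toy codebooks), and the threshold `tau` and all function names are hypothetical.

```python
import math
import random
from functools import lru_cache


def cosine(u, v):
    """Cosine similarity, used here as an assumed edge weight w(i, j)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_weight_perfect_matching(weights):
    """Exact maximum weight perfect matching by bitmask DP.

    A stand-in for Blossom: exponential in n, so only usable for a toy
    codebook with an even number of entries.
    """
    n = len(weights)

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched index
        rest = mask & ~(1 << i)
        result = (-math.inf, ())
        for j in range(i + 1, n):
            if rest >> j & 1:
                score, pairs = best(rest & ~(1 << j))
                cand = (score + weights[i][j], ((i, j),) + pairs)
                if cand[0] > result[0]:
                    result = cand
        return result

    return best((1 << n) - 1)


def assign_red_green(pairs, seed=0):
    """Randomly mark one index of each pair as green, the other as red."""
    rng = random.Random(seed)
    partner, green = {}, set()
    for i, j in pairs:
        green.add(rng.choice((i, j)))
        partner[i], partner[j] = j, i
    return partner, green


def replace_indices(indices, probs, partner, green, tau=1.0):
    """Swap a red index for its green partner when the model was not much
    more confident in the red one: log(P(idx) / P(partner)) <= tau.
    The thresholding rule is an assumption, not taken from the paper."""
    out = []
    for idx, p in zip(indices, probs):
        mate = partner[idx]
        if idx not in green and math.log(p[idx] / p[mate]) <= tau:
            idx = mate
        out.append(idx)
    return out


# Toy codebook of four 2-D embeddings: entries 0/1 and 2/3 are near-duplicates.
codebook = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
weights = [[cosine(u, v) for v in codebook] for u in codebook]
total, pairs = max_weight_perfect_matching(weights)
partner, green = assign_red_green(pairs, seed=0)
```

For a real codebook with thousands of entries, a polynomial-time matching routine would be needed in place of the DP; the sketch only shows how pairing, colour assignment, and confidence-guided replacement fit together.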
Abstract
Invisible image watermarking can protect image ownership and prevent malicious misuse of visual generative models. However, existing generative watermarking methods are mainly designed for diffusion models, while watermarking for autoregressive image generation models remains largely underexplored. We propose IndexMark, a training-free watermarking framework for autoregressive image generation models. IndexMark is inspired by the redundancy property of the codebook: replacing autoregressively generated indices with similar indices produces negligible visual differences. The core component of IndexMark is a simple yet effective match-then-replace method, which carefully selects watermark tokens from the codebook based on token similarity and promotes the use of watermark tokens through token replacement, thereby embedding the watermark without affecting image quality. Watermark verification is achieved by calculating the proportion of watermark tokens in generated images, with precision further improved by an Index Encoder. Furthermore, we introduce an auxiliary validation scheme to enhance robustness against cropping attacks. Experiments demonstrate that IndexMark achieves state-of-the-art performance in terms of image quality and verification accuracy, and exhibits robustness against various perturbations, including cropping, noise, Gaussian blur, random erasing, color jittering, and JPEG compression.
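Verification by the proportion of watermark tokens can be sketched as a one-sided test on the green-index rate. The sketch assumes that the indices have already been recovered from the image (the role the Index Encoder plays), that an unwatermarked image hits a green index with probability 0.5 under random red/green assignment, and that a normal approximation applies. The function names and the threshold `z_threshold` are hypothetical.

```python
import math


def green_rate(indices, green):
    """Fraction of recovered indices that belong to the green set."""
    return sum(1 for i in indices if i in green) / len(indices)


def is_watermarked(indices, green, z_threshold=3.0, null_rate=0.5):
    """Flag the image if the green rate exceeds the upper bound expected
    for an unwatermarked image under a normal (CLT) approximation."""
    n = len(indices)
    rate = green_rate(indices, green)
    std = math.sqrt(null_rate * (1.0 - null_rate) / n)
    upper = null_rate + z_threshold * std
    return rate > upper, rate, upper


# Toy example: even indices are green.
green = set(range(0, 200, 2))
marked = list(range(0, 200, 2)) + list(range(1, 41, 2))  # 100 green, 20 red
clean = list(range(100))                                  # 50 green, 50 red
marked_flag, marked_rate, _ = is_watermarked(marked, green)
clean_flag, clean_rate, _ = is_watermarked(clean, green)
```

A larger `z_threshold` lowers the false positive rate at the cost of missing weakly watermarked or heavily perturbed images.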
