TRACE: Structure-Aware Character Encoding for Robust and Generalizable Document Watermarking

Jiale Meng, Jie Zhang, Runyi Hu, Zhe-Ming Lu, Tianwei Zhang, Yiming Li

Abstract

We propose TRACE, a structure-aware framework that leverages diffusion models for localized character encoding to embed data. Unlike existing methods that rely on edge features or pre-defined codebooks, TRACE exploits character structures, which resist noise interference because they are stable and share a unified representation across diverse characters. Our framework comprises three key components: (1) adaptive diffusion initialization, which automatically identifies handle points, target points, and editing regions through specialized algorithms, namely a movement probability estimator (MPE), target point estimation (TPE), and a mask drawing model (MDM); (2) guided diffusion encoding for precise movement of the selected points; and (3) masked region replacement with a specialized loss function to minimize feature alterations after the diffusion process. Comprehensive experiments demonstrate TRACE's superior performance over state-of-the-art methods, achieving more than 5 dB improvement in PSNR and 5% higher extraction accuracy following cross-media transmission. TRACE generalizes across multiple languages and fonts, making it particularly suitable for practical document security applications.

Paper Structure (35 sections, 12 equations, 19 figures, 11 tables, 1 algorithm)

Figures (19)

  • Figure 1: Illustration of different document hiding methods. Image-based methods embed information by adjusting the proportion of black pixels but are vulnerable to noise from cross-media transmission, leading to extraction errors. Font-based methods rely on predefined character codebooks, which limits their generalizability when encountering unseen characters. In contrast, our method leverages character structures to guide diffusion-based encoding, ensuring broad applicability. Moreover, character structures are more stable to noise, providing strong robustness.
  • Figure 2: TRACE Embedding Pipeline. Step 1. Adaptive Diffusion Initialization: Handle points, target points, and masks are identified to guide the subsequent diffusion process. Step 2. Guided Diffusion Encoding: The diffusion model's UNet is fine-tuned via LoRA to better capture original image features. The marked image is then generated by controlled movement of handle points to target points within the predefined mask region. Step 3. Masked Region Replacement: The content within the mask of the original image is replaced with the corresponding encoded segments.
  • Figure 3: Illustration of the direction in which $P_h$ moves, depending on the orientation of its stroke.
  • Figure 4: Robustness against print-camera capture at varying viewpoints and distances.
  • Figure 5: Visual comparison of images encoded by our TRACE and by the comparison method [yang2023autostegafont].
  • ...and 14 more figures
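
Step 3 in the caption of Figure 2 (masked region replacement) can be illustrated with a minimal sketch. This is not the authors' implementation: images are represented as nested lists of grayscale values, the all-black `encoded` image is only a stand-in for the diffusion output, and the names `masked_region_replacement` and `psnr` are hypothetical. The sketch shows only the copy-inside-the-mask idea and how PSNR, the quality metric quoted in the abstract, would be computed on the result.

```python
import math


def masked_region_replacement(original, encoded, mask):
    """Return a copy of `original` with masked pixels taken from `encoded`.

    All three arguments are equally sized 2D lists (rows of pixels);
    `mask` holds truthy values inside the editing region.
    """
    if not (len(original) == len(encoded) == len(mask)):
        raise ValueError("inputs must have the same number of rows")
    marked = []
    for o_row, e_row, m_row in zip(original, encoded, mask):
        if not (len(o_row) == len(e_row) == len(m_row)):
            raise ValueError("rows must have the same length")
        # Inside the mask keep the encoded pixel, elsewhere keep the original.
        marked.append([e if m else o for o, e, m in zip(o_row, e_row, m_row)])
    return marked


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D lists."""
    diffs = [
        (r - t) ** 2
        for r_row, t_row in zip(reference, test)
        for r, t in zip(r_row, t_row)
    ]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Toy 8x8 example: a white page and an all-black stand-in for the encoded image
# (assumption: a real encoded image would come from the diffusion step).
original = [[255] * 8 for _ in range(8)]
encoded = [[0] * 8 for _ in range(8)]
mask = [[2 <= r < 4 and 2 <= c < 6 for c in range(8)] for r in range(8)]

marked = masked_region_replacement(original, encoded, mask)
print(psnr(original, marked))
```

Because pixels outside the mask are copied unchanged from the original, any distortion is confined to the editing region. In this toy case that is 8 of the 64 pixels.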