FontGuard: A Robust Font Watermarking Approach Leveraging Deep Font Knowledge
Kahim Wong, Jicheng Zhou, Kemou Li, Yain-Whar Si, Xiaowei Wu, Jiantao Zhou
TL;DR
FontGuard tackles robust font-based watermarking for AI-generated text by using a deep font model to synthesize high-quality, diverse watermarked fonts through perturbations of hidden style features. It couples a font-manifold–driven encoder with a CLIP-based decoder trained via language-guided contrastive learning, enabling reliable bit recovery under realistic distortions. A generalized variant, FontGuard-GEN, generates watermarks for unseen fonts without retraining by incorporating style prompts and a style-consistent loss. Empirical results show higher decoding accuracy than state-of-the-art methods under synthetic, cross-media, and online social network (OSN) distortions (+5.4%, +7.4%, and +5.8%, respectively), along with a 52.7% improvement in visual quality as measured by LPIPS and strong generalization to unseen fonts. The approach thus offers scalable, practical font watermarking for copyright protection, provenance, and compliance in AI-generated text.
Abstract
The proliferation of AI-generated content raises significant forensic and security concerns, such as source tracing and copyright protection, highlighting the need for effective watermarking technologies. Font-based text watermarking has emerged as an effective way to embed information that supports copyright protection, traceability, and compliance of generated text content. Existing font watermarking methods usually neglect essential font knowledge, which leads to low-quality watermarked fonts and limited embedding capacity. These methods are also vulnerable to real-world distortions, low-resolution fonts, and inaccurate character segmentation. In this paper, we introduce FontGuard, a novel font watermarking model that harnesses the capabilities of font models and language-guided contrastive learning. Unlike previous methods that focus solely on pixel-level alteration, FontGuard modifies fonts by altering hidden style features, resulting in better font quality after watermark embedding. We also leverage the font manifold to increase the embedding capacity of our method by generating a large number of font variants that closely resemble the original font. Furthermore, the decoder employs image-text contrastive learning to reconstruct the embedded bits, which achieves robustness against various real-world transmission distortions. FontGuard outperforms state-of-the-art methods by +5.4%, +7.4%, and +5.8% in decoding accuracy under synthetic, cross-media, and online social network distortions, respectively, while improving the visual quality by 52.7% in terms of LPIPS. Moreover, FontGuard uniquely allows the generation of watermarked fonts for unseen fonts without re-training the network. The code and dataset are available at https://github.com/KAHIMWONG/FontGuard.
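As a rough illustration of the decoding idea described in the abstract, the sketch below shows how image-text contrastive matching could recover embedded bits: a glyph embedding is compared against one text embedding per font variant, and the best match is mapped to bits. This is a toy under stated assumptions, not the FontGuard implementation. The one-hot vectors stand in for CLIP text embeddings, `noisy_glyph_embedding` stands in for an image encoder applied to a distorted glyph, and the mapping of K variants to log2(K) bits per character, the temperature value, and all function names are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(image_emb, text_embs, target, temperature=0.07):
    """Contrastive (InfoNCE-style) loss: pull the glyph embedding toward the
    text embedding of its true variant and away from the other variants."""
    logits = [cosine(image_emb, t) / temperature for t in text_embs]
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]


def decode_symbol(image_emb, text_embs):
    """Pick the variant whose text embedding is most similar to the glyph."""
    sims = [cosine(image_emb, t) for t in text_embs]
    return max(range(len(sims)), key=sims.__getitem__)


def symbol_to_bits(symbol, bits_per_char):
    """Convert a variant index to its bits, most significant bit first."""
    return [(symbol >> i) & 1 for i in reversed(range(bits_per_char))]


# Hypothetical setup: 4 font variants per character, i.e. 2 bits per character.
BITS_PER_CHAR = 2
NUM_VARIANTS = 2 ** BITS_PER_CHAR

# Stand-in "text embeddings": one orthogonal vector per variant (assumption).
text_embs = [
    [1.0 if i == k else 0.0 for i in range(NUM_VARIANTS)]
    for k in range(NUM_VARIANTS)
]

rng = random.Random(0)


def noisy_glyph_embedding(symbol, noise=0.1):
    """Stand-in for an image encoder applied to a distorted glyph (assumption)."""
    return [x + rng.gauss(0.0, noise) for x in text_embs[symbol]]


message = [3, 0, 2, 1]  # one variant index per character
recovered = [decode_symbol(noisy_glyph_embedding(s), text_embs) for s in message]
bits = [b for s in recovered for b in symbol_to_bits(s, BITS_PER_CHAR)]
print(recovered, bits)
```

In a trained system the encoders would be learned so that distorted glyphs still land closest to the correct text embedding; the loss function here only indicates the general shape such a training objective might take.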
