Optimizing Token Choice for Code Watermarking: An RL Approach
Zhimeng Guo, Huaisheng Zhu, Siyuan Xu, Hangfan Zhang, Teng Xiao, Minhao Cheng
TL;DR
CodeTracer is an adaptive, policy-driven watermarking framework for LLM-generated code that embeds statistically detectable watermarks during generation without compromising functionality. It couples a trainable watermark policy with a frozen base LLM. The policy is trained via GRPO to optimize both execution correctness and watermark detectability, using a dual reward structure and making discrete token decisions differentiable through Straight-Through Estimation and Gumbel-Top-k. The approach reports higher watermark detectability (AUROC) than baselines while maintaining code quality (Pass@1) across Python and cross-language benchmarks, with limited computational overhead and strong robustness to attacks and model transfer. In practice, CodeTracer enables plug-in watermarking for diverse code-generation models, offering scalable IP protection and attribution in real-world AI code production. The framework advances watermarking by combining syntactic awareness, verifiable rewards, and efficient learning to operate within the constraints of structured programming languages.
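To make the training signal concrete, the following is a minimal sketch of how a dual reward could feed GRPO-style group-relative advantages. It is not the authors' implementation: the function names, the weights `w_exec` and `w_wm`, and the assumption that detectability arrives as a score in [0, 1] are all hypothetical. Only the group normalization (reward minus group mean, divided by group standard deviation) follows the standard GRPO recipe.

```python
import statistics


def combined_reward(passed_tests, detect_score, w_exec=1.0, w_wm=0.5):
    """Hypothetical dual reward: execution correctness plus watermark signal.

    passed_tests: True if the generated code passed its unit tests.
    detect_score: assumed detectability score in [0, 1].
    The weights are illustrative, not values from the paper.
    """
    return w_exec * (1.0 if passed_tests else 0.0) + w_wm * detect_score


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    # Population standard deviation; some implementations use the sample one.
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# One prompt, four sampled completions: (passed_tests, detect_score).
samples = [(True, 0.9), (True, 0.2), (False, 0.95), (False, 0.1)]
rewards = [combined_reward(p, d) for p, d in samples]
advantages = group_relative_advantages(rewards)
```

With these illustrative weights, a completion that passes its tests always outranks one that fails, and detectability only breaks ties among completions with the same execution outcome. That ordering is a design choice of this sketch, not a claim about the paper's reward.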
Abstract
Protecting intellectual property in LLM-generated code requires watermarking systems that can operate within code's highly structured, syntactically constrained nature. In this work, we introduce CodeTracer, an adaptive code watermarking framework trained with a reinforcement learning paradigm. At its core, CodeTracer uses a parameterized policy model to bias token choices during next-token prediction, so that embedded watermarks preserve code functionality while exhibiting subtle yet statistically detectable deviations from typical token distributions. To facilitate policy learning, we devise a reward system that integrates execution feedback with watermark embedding signals, balancing process-level and outcome-level rewards. Additionally, we employ Gumbel-Top-k reparameterization to enable gradient-based optimization of discrete watermarking decisions. Extensive comparative evaluations show that CodeTracer outperforms state-of-the-art baselines in both watermark detectability and preservation of the generated code's functionality.
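As an illustration of the Gumbel-Top-k idea mentioned above, here is a minimal standard-library sketch of the sampling step only: perturb each logit with independent Gumbel(0, 1) noise and keep the k largest. The function name, the `temperature` parameter, and the reading of the selected indices as "tokens the policy chooses to favor" are assumptions made for this example. The gradient path (for instance, a straight-through estimator in an autodiff framework) is not shown.

```python
import math
import random


def gumbel_top_k(logits, k, rng=None, temperature=1.0):
    """Sample k distinct indices by taking the top-k Gumbel-perturbed logits.

    This is equivalent to sampling k items without replacement from the
    softmax distribution over logits / temperature.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    rng = rng or random.Random()
    perturbed = []
    for index, logit in enumerate(logits):
        # random() returns a value in [0, 1); clamp away from 0 to keep log finite.
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit / temperature + gumbel_noise, index))
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


# Hypothetical policy scores over a 6-token vocabulary; pick 3 tokens to favor.
policy_logits = [2.0, 0.5, -1.0, 1.5, 0.0, -0.5]
favored = gumbel_top_k(policy_logits, k=3, rng=random.Random(0))
```

Because the randomness enters only through additive noise, the selection can be written as a deterministic function of the logits and the noise, which is what makes a reparameterized gradient estimate possible in a differentiable implementation.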
