CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code
Batu Guan, Yao Wan, Zhangqian Bi, Zheng Wang, Hongyu Zhang, Pan Zhou, Lichao Sun
TL;DR
CodeIP is a grammar-guided, multi-bit soft watermarking framework for LLM-generated code that embeds provenance information without sacrificing code utility. During decoding, it combines a watermark logit with a lexical token type predictor and grammar constraints, steering token generation so that the output encodes a watermark message. Extraction is robust and works by re-simulating the insertion process with candidate messages. Empirical results across three LLMs and five programming languages show high watermark extraction rates (around 0.95 on average) and notably smaller CodeBLEU degradation than the baselines, with the type predictor contributing significantly to preserving semantics. The approach offers practical IP protection for code-generation models and is resilient to partial-crop attacks, which highlights its potential for secure deployment of LLM-powered development tools.
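The insertion and extraction loop summarized above can be illustrated with a toy sketch. It is not the authors' implementation: the integer vocabulary, the hash-based `is_green` rule, the stand-in functions `base_logit` and `predicted_type`, and the strengths `DELTA` and `GAMMA` are all assumptions made for illustration. A real system would take base logits from a code LLM and token types from a trained predictor.

```python
import hashlib

VOCAB_SIZE = 50   # toy vocabulary of integer token ids (assumption)
NUM_TYPES = 5     # toy lexical types; token t has type t % NUM_TYPES (assumption)
DELTA = 4.0       # watermark logit strength (assumed value)
GAMMA = 2.0       # type/grammar logit strength (assumed value)


def _unit_hash(*parts):
    """Deterministically map the given values to a float in [0, 1)."""
    digest = hashlib.sha256(repr(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def is_green(message, prev_token, token):
    """True if `token` is favoured after `prev_token` when encoding `message`."""
    return _unit_hash("wm", message, prev_token, token) < 0.5


def base_logit(step, prev_token, token):
    """Hypothetical stand-in for the code LLM's logit, in [0, 1)."""
    return _unit_hash("lm", step, prev_token, token)


def predicted_type(step, prev_token):
    """Hypothetical stand-in for a lexical token type predictor."""
    return (prev_token + step) % NUM_TYPES


def generate(message, length=40, start_token=0):
    """Greedy decoding with watermark and type logits added to the base logits."""
    tokens, prev = [], start_token
    for step in range(length):
        wanted_type = predicted_type(step, prev)

        def score(tok):
            s = base_logit(step, prev, tok)
            if is_green(message, prev, tok):
                s += DELTA   # watermark logit
            if tok % NUM_TYPES == wanted_type:
                s += GAMMA   # type/grammar logit
            return s

        prev = max(range(VOCAB_SIZE), key=score)
        tokens.append(prev)
    return tokens


def extract(tokens, candidates, start_token=0):
    """Re-simulate insertion for each candidate message and keep the best match."""
    prevs = [start_token] + list(tokens[:-1])

    def agreement(message):
        return sum(is_green(message, p, t) for p, t in zip(prevs, tokens))

    return max(candidates, key=agreement)


tokens = generate(message=11)
recovered = extract(tokens, candidates=range(16))
# Extraction from a cropped suffix, given the token just before the crop.
recovered_cropped = extract(tokens[10:], range(16), start_token=tokens[9])
```

Because the green-list test depends only on the previous token and the message, a contiguous fragment of the output still carries evidence for the message. This is one simple way to see why extraction of this kind can tolerate cropping. In this sketch the watermark strength deliberately dominates the other terms, whereas a practical system would balance it against code quality.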
Abstract
Large Language Models (LLMs) have achieved remarkable progress in code generation. It has therefore become crucial to identify whether a piece of code is AI-generated and to determine the specific model used, particularly for purposes such as protecting Intellectual Property (IP) in industry and preventing cheating in programming exercises. To this end, several attempts have been made to insert watermarks into machine-generated code. However, existing approaches are limited to inserting only a single bit of information. In this paper, we introduce CodeIP, a novel multi-bit watermarking technique that inserts additional information to preserve crucial provenance details, such as the vendor ID of an LLM, thereby safeguarding the IP of LLMs in code generation. Furthermore, to ensure the syntactic correctness of the generated code, we propose training a type predictor and using it to constrain the sampling process when predicting the next token. Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP in watermarking LLMs for code generation while maintaining the syntactic correctness of the code.
