Watermarking Techniques for Large Language Models: A Survey

Yuqing Liang, Jiancheng Xiao, Wensheng Gan, Philip S. Yu

TL;DR

This survey addresses the problem of protecting intellectual property (IP) and enabling traceability for large language models (LLMs) by reviewing watermarking approaches across text, image, audio, and multimodal outputs. It reviews traditional digital watermarking concepts and connects them to LLM-specific techniques, including text-domain, image-domain, audio-domain, and dynamic watermarking, as well as cryptography-based and backdoor-based methods. Key contributions include a taxonomy of LLM watermarking techniques, an analysis of what these techniques inherit from traditional watermarking, a discussion of advantages and limitations, and a roadmap for multimodal and dynamic watermarking, along with open challenges such as robustness, evaluation standards, and scalability. The work highlights practical implications for copyright protection, academic integrity, misinformation detection, and security verification, and emphasizes the need for standards, policy collaboration, and active defense strategies to realize robust, scalable IP protection for LLMs.

Abstract

With the rapid advancement and broad application of artificial intelligence, large language models (LLMs) are widely used to enhance production, creativity, learning, and work efficiency across many domains. However, the misuse of LLMs can also harm society, raising issues such as intellectual property (IP) infringement, academic misconduct, false content, and hallucinations. Prior research has proposed LLM watermarking to protect the IP of LLMs and to trace the multimedia data they output. To our knowledge, this is the first thorough review that investigates and analyzes LLM watermarking technology in detail. This review begins by recounting the history of traditional watermarking technology, then analyzes the current state of LLM watermarking research, and examines how LLM watermarking inherits from and relates to traditional techniques. This analysis gives researchers ideas for applying traditional digital watermarking techniques to LLM watermarking, promoting the cross-integration and innovation of watermarking technology. In addition, this review examines the pros and cons of LLM watermarking. Given the current trend toward multimodal LLMs, it provides a detailed analysis of emerging multimodal LLM watermarking, covering modalities such as visual and audio data, to offer further reference points for related research. Finally, this review discusses the challenges and future prospects of current watermarking technologies, offering insights for future LLM watermarking research and applications.

Paper Structure (50 sections, 19 figures, 7 tables)

Figures (19)

  • Figure 1: Taxonomy of LLM watermarking techniques.
  • Figure 2: Modifying the model generation process at the sentence level (hou2023semstamp; hou2024k).
  • Figure 3: SEMSTAMP vs. $k$-SEMSTAMP (hou2024k).
  • Figure 4: (1) No watermark: about half the words in the text are on the green list. (2) Hard watermark: all words in the text are on the green list. (3) Soft watermark: an offset increases the probability of generating a green-list word.
  • Figure 5: Modifying the model generation process at the token level (liu2023private; liu2023semantic).
  • ...and 14 more figures
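
The three cases in the Figure 4 caption (no watermark, hard watermark, soft watermark) can be illustrated with a small sketch. This is not the paper's implementation: the toy vocabulary `VOCAB`, the green-list fraction `GAMMA`, the offset `delta`, seeding the green list from the previous token, and the random logits that stand in for a language model are all assumptions made for illustration.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(50)]  # hypothetical toy vocabulary
GAMMA = 0.5  # assumed fraction of the vocabulary placed on the green list


def green_list(prev_token):
    """Pseudo-randomly pick the green part of VOCAB, seeded by the previous token (assumption)."""
    seed = int.from_bytes(hashlib.sha256(prev_token.encode()).digest()[:8], "big")
    shuffled = VOCAB[:]
    random.Random(seed).shuffle(shuffled)
    return set(shuffled[: int(GAMMA * len(VOCAB))])


def sample_next(prev_token, logits, mode, delta, rng):
    """Sample one token under the 'none', 'soft', or 'hard' scheme."""
    green = green_list(prev_token)
    if mode == "hard":
        # Hard watermark: only green-list tokens may be generated.
        candidates = [t for t in VOCAB if t in green]
        weights = [math.exp(logits[t]) for t in candidates]
    elif mode == "soft":
        # Soft watermark: add an offset delta to the logits of green-list tokens.
        candidates = VOCAB
        weights = [math.exp(logits[t] + (delta if t in green else 0.0)) for t in VOCAB]
    else:
        # No watermark: sample from the unmodified distribution.
        candidates = VOCAB
        weights = [math.exp(logits[t]) for t in VOCAB]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate(n, mode, delta=2.0, seed=0):
    """Generate n tokens; random logits stand in for a real model (assumption)."""
    rng = random.Random(seed)
    tokens = ["w0"]
    for _ in range(n):
        logits = {t: rng.gauss(0.0, 1.0) for t in VOCAB}
        tokens.append(sample_next(tokens[-1], logits, mode, delta, rng))
    return tokens


def green_fraction(tokens):
    """Fraction of generated tokens that fall on the green list of their predecessor."""
    hits = sum(tokens[i] in green_list(tokens[i - 1]) for i in range(1, len(tokens)))
    return hits / (len(tokens) - 1)


if __name__ == "__main__":
    for mode in ("none", "soft", "hard"):
        print(mode, green_fraction(generate(200, mode)))
```

Under these assumptions, the green fraction should sit near one half without a watermark, rise noticeably with the soft offset, and equal one for the hard watermark, which is the statistic a detector would compare against the no-watermark baseline.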