Is Your Writing Being Mimicked by AI? Unveiling Imitation with Invisible Watermarks in Creative Writing

Ziwei Zhang, Juan Wen, Wanli Peng, Zhengxian Wu, Yinghan Zhou, Yiming Xue

TL;DR

This paper identifies the risk of unauthorized imitation of creative writing via AI-driven knowledge injection and proposes WIND, a verifiable zero-watermarking scheme that distills five elements of creative writing into a condensed implicit watermark within a disentangled space, preserving text quality while enabling ownership verification. The approach combines instance-aware delimitation, LLM-based condensation, and watermark mapping to produce robust, non-disruptive copyright protection that remains effective under cross-model imitation and adversarial edits, with theoretical guarantees and empirical validation. On Shakespeare and ROCStories, WIND achieves F1 scores above $98\%$ and false-positive rates below $2\%$, outperforming state-of-the-art watermarking baselines and remaining resilient under robustness attacks. The work offers a practical, scalable framework for protecting textual datasets in real-world copyright contexts and sets a path for extending implicit watermarking to broader textual domains.

Abstract

Efficient knowledge injection methods for Large Language Models (LLMs), such as In-Context Learning, knowledge editing, and efficient parameter fine-tuning, significantly enhance model utility on downstream tasks. However, they also pose substantial risks of unauthorized imitation and compromised data provenance for high-value unstructured data assets like creative works. Current copyright protection methods for creative works predominantly focus on visual arts, leaving a critical and unaddressed data engineering challenge in the safeguarding of creative writing. In this paper, we propose WIND (Watermarking via Implicit and Non-disruptive Disentanglement), a novel zero-watermarking, verifiable and implicit scheme that safeguards creative writing databases by providing verifiable copyright protection. Specifically, we decompose creative essence into five key elements, which are extracted utilizing LLMs through a designed instance delimitation mechanism and consolidated into condensed-lists. These lists enable WIND to convert core copyright attributes into verifiable watermarks via implicit encoding within a disentanglement creative space, where 'disentanglement' refers to the separation of creative-specific and creative-irrelevant features. This approach, utilizing implicit encoding, avoids distorting fragile textual content. Extensive experiments demonstrate that WIND effectively verifies creative writing copyright ownership against AI imitation, achieving F1 scores above 98% and maintaining robust performance under stringent low false-positive rates where existing state-of-the-art text watermarking methods struggle.
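
The two headline metrics above, F1 score and false-positive rate, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `f1_and_fpr` and the toy labels are hypothetical, and it assumes verification is scored as a binary decision where 1 means "imitates the protected writing" and 0 means "does not".

```python
def f1_and_fpr(y_true, y_pred):
    """Return (F1, false-positive rate) for binary labels in {0, 1}.

    Assumption: label 1 = text flagged as an imitation of protected writing.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # imitations correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # innocent texts wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # imitations missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # innocent texts correctly passed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return f1, fpr


# Toy data (hypothetical): 4 imitations and 4 unrelated texts.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
f1, fpr = f1_and_fpr(y_true, y_pred)
print(f"F1 = {f1:.2f}, FPR = {fpr:.2f}")
```

A low false-positive rate matters in a copyright setting because each false positive is a wrongful accusation of imitation, which is why the abstract reports performance under low-FPR constraints alongside F1.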

Paper Structure

This paper contains 21 sections, 18 equations, 5 figures, 11 tables, 2 algorithms.

Figures (5)

  • Figure 1: The application scenario of the implicitly verifiable watermark.
  • Figure 2: The overall framework of WIND, which consists of three main phases: (1) Distance-driven Delimitation, (2) LLM-dominated Condensation, and (3) Watermark Mapping. The numbers within the squares of $\bm{T_H}$, $\bm{T_P}$, and $\bm{T_N}$ represent different samples. Additionally, a star in the entangled feature space and a diamond in the disentangled space (of the same color) denote the same sample.
  • Figure 3: Performance of WIND-G as an illustrative case.
  • Figure 4: The effectiveness of the regularization penalty, where the area within the dashed line represents the standard deviation.
  • Figure 5: The performance when $num$ changes.