Randomization Techniques to Mitigate the Risk of Copyright Infringement

Wei-Ning Chen, Peter Kairouz, Sewoong Oh, Zheng Xu

TL;DR

The paper tackles the problem of copyright infringement risk in large-scale language models by introducing randomized, Near Access-Freeness (NAF) based approaches that complement existing input- and output-based defenses. It formalizes $k_x$-NAF as a divergence-based measure between the unsafe model and safe models and proposes CP-$\Delta$ and CP-$\kappa$ as practical algorithms to achieve NAF, including a Monte Carlo method to empirically estimate $k_x$. Through experiments on token- and sentence-level generation, memorization tasks, and a federated StackOverflow setting, the work shows how these methods can mitigate generation of substantially similar copyrighted content and explores the tradeoffs with differential privacy. The study also demonstrates that additional randomization, such as higher decoding temperatures, can further strengthen NAF, and suggests future directions combining DP-based mechanisms with NAF-based guarantees and retrieval-based mitigation. Overall, the work provides a framework for quantifying and improving copyright-protective randomness in generative systems with practical auditing tools and empirical validation.

Abstract

In this paper, we investigate potential randomization approaches that can complement current practices of input-based methods (such as licensing data and prompt filtering) and output-based methods (such as recitation checker, license checker, and model-based similarity score) for copyright protection. This is motivated by the inherent ambiguity of the rules that determine substantial similarity in copyright precedents. Given that there is no quantifiable measure of substantial similarity that is agreed upon, complementary approaches can potentially further decrease liability. Similar randomized approaches, such as differential privacy, have been successful in mitigating privacy risks. This document focuses on the technical and research perspective on mitigating copyright violation and hence is not confidential. After investigating potential solutions and running numerical experiments, we concluded that using the notion of Near Access-Freeness (NAF) to measure the degree of substantial similarity is challenging, and the standard approach of training a Differentially Private (DP) model costs significantly when used to ensure NAF. Alternative approaches, such as retrieval models, might provide a more controllable scheme for mitigating substantial similarity.

Paper Structure (19 sections, 1 theorem, 7 equations, 6 figures, 1 table, 3 algorithms)

This paper contains 19 sections, 1 theorem, 7 equations, 6 figures, 1 table, 3 algorithms.

Key Result

Lemma 3.1

Assume $p(\cdot|x) \ll q_j(\cdot|x)$ for all $j \in [m]$. If $p(y|x) \geq \alpha$ and $q_j(y|x) \geq \alpha$ for all $y \in \mathsf{supp}\left( q_j(\cdot|x) \right)$, then it holds that, with probability $1-\delta$ where $k_x \triangleq \max_j \Delta_\mathsf{KL}(p(\cdot|x), q_j(\cdot|x))$, $r_j^n \triangleq \left( p(y_i|x)/ q_j(y_i|x) \right)_{i \in [n]}$ and $V_n(r_j^n)$ is the sample variance of
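The quantities named in the lemma can be illustrated with a plain Monte Carlo estimate. The sketch below is not the lemma's bound: it assumes samples $y_i$ are drawn from $p(\cdot|x)$, that each distribution is a small categorical given as a dictionary, and that $\Delta_\mathsf{KL}$ is estimated as the sample mean of $\log\left(p(y_i|x)/q_j(y_i|x)\right)$. The function names `estimate_kx` and `exact_kl` are hypothetical.

```python
import math
import random
import statistics


def exact_kl(p, q):
    """Exact KL(p || q) for categorical distributions given as dicts."""
    return sum(py * math.log(py / q[y]) for y, py in p.items() if py > 0)


def estimate_kx(p, safe_models, n=20000, seed=0):
    """Monte Carlo estimate of max_j KL(p || q_j) for one fixed prompt x.

    Assumption: samples are drawn from p, and p is absolutely continuous
    with respect to every q_j (each q_j[y] > 0 wherever p[y] > 0).
    Returns the largest estimate and the per-model (mean, variance) pairs.
    """
    rng = random.Random(seed)
    outcomes = list(p)
    weights = [p[y] for y in outcomes]
    samples = rng.choices(outcomes, weights=weights, k=n)

    per_model = []
    for q in safe_models:
        # Log of the likelihood ratios r_j^n evaluated on the samples.
        log_ratios = [math.log(p[y] / q[y]) for y in samples]
        mean = statistics.fmean(log_ratios)
        var = statistics.variance(log_ratios)  # sample variance
        per_model.append((mean, var))

    return max(mean for mean, _ in per_model), per_model


if __name__ == "__main__":
    p = {"a": 0.7, "b": 0.2, "c": 0.1}    # toy "unsafe" model output
    q1 = {"a": 0.5, "b": 0.3, "c": 0.2}   # toy safe model 1
    q2 = {"a": 0.4, "b": 0.4, "c": 0.2}   # toy safe model 2
    k_hat, stats = estimate_kx(p, [q1, q2])
    print("estimated k_x:", k_hat)
    print("exact k_x:", max(exact_kl(p, q1), exact_kl(p, q2)))
```

The sample variance is reported alongside each mean because a variance-dependent confidence term would be built from it; the exact form of that term is the part of the lemma not reproduced here.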

Figures (6)

  • Figure 1: A comparison between $k$-NAF (with $\Delta_\mathsf{max}$) and $\varepsilon$-DP (with $\delta = 10^{-6}$).
  • Figure 2: $\mathsf{CP}$-$\kappa$ with threshold $\kappa \in \{1.0, 3.0, 5.0, 7.0\}$. We see that decreasing $\kappa$ does not necessarily reduce $k_x$. For instance, when $\kappa = 1.0$, most $k_x$'s are still greater than $1.0$.
  • Figure 3: Sentence-level NAF bounds based on Monte Carlo estimators on a fine-tuned GPT-2. We use a token-level $\mathsf{CP}$-$\Delta$ to ensemble models and plot the NAF bound against the sequence length.
  • Figure 4: Normalized edit distance between the generated samples and the true samples (duplicated in the training phase).
  • Figure 5: Sentence-level NAF bounds under different decoding temperatures.
  • ...and 1 more figure
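
The normalized edit distance used in Figure 4 can be sketched as follows. This is a minimal illustration, assuming Levenshtein distance divided by the length of the longer sequence; the captions do not state which normalization the paper uses, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(generated, reference):
    """Edit distance scaled to [0, 1] by the longer sequence length.

    Assumption: normalization by max length; 0 means identical,
    1 means nothing in common position-wise.
    """
    longest = max(len(generated), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(generated, reference) / longest


if __name__ == "__main__":
    print(normalized_edit_distance("kitten", "sitting"))
```

Under this convention, a value near 0 indicates that a generated sample nearly reproduces a training sample, which is the memorization signal the figure tracks.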

Theorems & Definitions (3)

  • Definition 2.1
  • Definition 2.2: Differentially private generation
  • Lemma 3.1