Creative Ownership in the Age of AI

Annie Liang, Jay Lu

TL;DR

This paper reframes copyright concerns for generative AI by proposing a counterfactual infringement criterion: a generated output infringes on an existing work if it could not have been generated without that work appearing in the training corpus. It models generation as a closure operator on a corpus of works and defines the permissible set as those outputs that do not depend essentially on any single input work. The key results show a monotone, closed structure for permissible generation and establish an asymptotic dichotomy: with light-tailed, gradual creative innovation, the permissible set eventually dominates (regulation eventually ceases to bind), whereas heavy-tailed innovation can leave a persistent violation set. Extensions to general group violations and protected-collection frameworks highlight how coalition effects and unprotected works shape permissible generation and licensing dynamics. Together, the paper provides a formal, workable foundation for evaluating attribution, licensing, and regulation in AI-assisted creativity, with implications for policy and downstream innovation incentives.

Abstract

Copyright law focuses on whether a new work is "substantially similar" to an existing one, but generative AI can closely imitate style without copying content, a capability now central to ongoing litigation. We argue that existing definitions of infringement are ill-suited to this setting and propose a new criterion: a generative AI output infringes on an existing work if it could not have been generated without that work in its training corpus. To operationalize this definition, we model generative systems as closure operators mapping a corpus of existing works to an output of new works. AI-generated outputs are permissible if they do not infringe on any existing work according to our criterion. Our results characterize structural properties of permissible generation and reveal a sharp asymptotic dichotomy: when the process of organic creations is light-tailed, dependence on individual works eventually vanishes, so that regulation imposes no limits on AI generation; with heavy-tailed creations, regulation can be persistently constraining.
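
The counterfactual criterion in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: works are integers, and `interval_generator` is a hypothetical toy closure operator that returns every integer between the smallest and largest work in the corpus. An output counts as a violation of a work if it is generated from the full corpus but not from the corpus with that work removed, and it is permissible if it violates no work.

```python
from typing import Callable, FrozenSet

Corpus = FrozenSet[int]
Generator = Callable[[Corpus], FrozenSet[int]]


def interval_generator(corpus: Corpus) -> FrozenSet[int]:
    """Hypothetical toy generator: all integers between min and max of the corpus."""
    if not corpus:
        return frozenset()
    return frozenset(range(min(corpus), max(corpus) + 1))


def violations(g: Generator, corpus: Corpus, work: int) -> FrozenSet[int]:
    """Outputs of g(corpus) that can no longer be generated once `work` is removed."""
    return g(corpus) - g(corpus - {work})


def permissible(g: Generator, corpus: Corpus) -> FrozenSet[int]:
    """Outputs of g(corpus) that are not a violation of any single work."""
    outputs = g(corpus)
    for work in corpus:
        outputs = outputs - violations(g, corpus, work)
    return outputs


corpus = frozenset({1, 4, 9})
print(sorted(violations(interval_generator, corpus, 1)))  # outputs that need work 1
print(sorted(permissible(interval_generator, corpus)))    # outputs that need no single work
```

In this toy corpus, the outputs below 4 depend on work 1 and the outputs above 4 depend on work 9, so only 4 survives every leave-one-out check.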

Paper Structure

This paper contains 33 sections, 14 theorems, 89 equations, and 4 figures.

Key Result

Proposition 1

For every generator $g$, the permissible set satisfies:

Figures (4)

  • Figure 1: Consider the convex hull generator $g_{conv}$ and the corpus $C=\{c_1,c_2,\dots,c_5\}$. Panel (a): $V_{c_1}$ is the set of $c_1$-violations (which are only constructible using $c_1$), and $P_{c_1}$ is its complement. Panel (b): $P = \bigcap_{i=1}^5 P_{c_i}$ is the set of points that are not $c_i$-violations for any $i=1,2,\dots,5$.
  • Figure 2: Replication of Figure 1 for the box generator $g_{box}$.
  • Figure 3: Let $d=2$, so that creations are points in $\mathbb{R}^2$. Panel (a): No three points in general position can be separated into disjoint subsets whose convex hulls overlap, so the Radon number is at least three. Panel (b): Every set of four (or more) points can be separated in this way; for example, in the figure, let $A=\{c_1,c_3\}$ and $B=\{c_2,c_4\}$.
  • Figure 4: Two examples illustrating how adding a work can affect the permissible set under the convex hull generator. In Panel (a), adding $c_4$ leaves the permissible set unchanged: $p_g(\{c_1,c_2,c_3\})= \{c_2\}=p_g(\{c_1,c_2,c_3,c_4\})$. In Panel (b), adding $c_4$ strictly expands the permissible set: $p_g(\{c_1,c_2,c_3\})=\varnothing$ while $p_g(\{c_1,c_2,c_3,c_4\})$ is the singleton intersection point of the line segments $\overline{c_1c_3}$ and $\overline{c_2c_4}$, as indicated in red.
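
The convex hull examples in the figure captions can be checked numerically with a short sketch. It assumes, consistently with the captions, that a point is permissible when it lies in the convex hull of the corpus and also in the hull of every corpus obtained by removing one work. The coordinates, the tolerance `EPS`, and all function names are hypothetical choices for illustration. Hull membership in $\mathbb{R}^2$ is tested via Carathéodory's theorem: a point is in the hull of a finite set if and only if it lies in the (possibly degenerate) triangle spanned by some three of its points.

```python
from itertools import combinations

EPS = 1e-9  # numerical tolerance (an illustrative choice)


def cross(o, a, b):
    """Twice the signed area of the triangle o-a-b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def on_segment(p, a, b):
    """True if p lies on the closed segment from a to b (a == b is allowed)."""
    if abs(cross(a, b, p)) > EPS:
        return False
    return (min(a[0], b[0]) - EPS <= p[0] <= max(a[0], b[0]) + EPS
            and min(a[1], b[1]) - EPS <= p[1] <= max(a[1], b[1]) + EPS)


def in_triangle(p, a, b, c):
    """True if p lies in the closed triangle a-b-c, including degenerate ones."""
    if abs(cross(a, b, c)) <= EPS:  # collinear vertices: fall back to segments
        return on_segment(p, a, b) or on_segment(p, b, c) or on_segment(p, a, c)
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (min(d) < -EPS and max(d) > EPS)


def in_hull(p, points):
    """True if p lies in the convex hull of a finite list of points in the plane."""
    points = list(points)
    if not points:
        return False
    if len(points) == 1:
        return on_segment(p, points[0], points[0])
    if len(points) == 2:
        return on_segment(p, points[0], points[1])
    return any(in_triangle(p, a, b, c) for a, b, c in combinations(points, 3))


def is_permissible(x, corpus):
    """True if x is generated by the corpus and by every leave-one-out corpus."""
    corpus = list(corpus)
    if not in_hull(x, corpus):
        return False
    return all(in_hull(x, corpus[:i] + corpus[i + 1:]) for i in range(len(corpus)))


# Hypothetical coordinates: four works at the corners of a square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(is_permissible((1, 1), square))      # intersection of the two diagonals
print(is_permissible((1, 1), square[:3]))  # same point with the fourth work removed
```

With these coordinates, the intersection of the diagonals passes every leave-one-out check for the four-work corpus but fails once the fourth work is dropped, which mirrors the behavior described for Panel (b) of Figure 4.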

Theorems & Definitions (35)

  • Example 1: Novels
  • Example 2: Cartoons
  • Example 3: Actors
  • Definition 1
  • Example 4: Convex Hull Generator
  • Example 5: Splice Generator
  • Example 6: Box Generator
  • Definition 2
  • Definition 3
  • Definition 4
  • ...and 25 more