Creative Ownership in the Age of AI
Annie Liang, Jay Lu
TL;DR
This paper reframes copyright concerns for generative AI by proposing a counterfactual infringement criterion: a generated output infringes an existing work if it could not have been generated without that work appearing in the training corpus. It models generation as a closure operator on a corpus of works and defines the permissible set as those outputs that do not depend essentially on any single input work. The key results show a monotone, closed structure for permissible generation and establish an asymptotic dichotomy: with light-tailed, gradual creative innovation, the permissible set eventually dominates (regulation no longer constrains generation), whereas heavy-tailed innovation can leave a persistent violation set. Extensions to general group violations and protected-collection frameworks highlight how coalition effects and unprotected works shape permissible generation and licensing dynamics. Overall, the paper provides a formal, workable foundation for evaluating attribution, licensing, and regulation in AI-assisted creativity, with implications for policy and downstream innovation incentives.
Abstract
Copyright law focuses on whether a new work is "substantially similar" to an existing one, but generative AI can closely imitate style without copying content, a capability now central to ongoing litigation. We argue that existing definitions of infringement are ill-suited to this setting and propose a new criterion: a generative AI output infringes on an existing work if it could not have been generated without that work in its training corpus. To operationalize this definition, we model generative systems as closure operators mapping a corpus of existing works to an output of new works. AI-generated outputs are permissible if they do not infringe on any existing work according to our criterion. Our results characterize structural properties of permissible generation and reveal a sharp asymptotic dichotomy: when the process of organic creations is light-tailed, dependence on individual works eventually vanishes, so that regulation imposes no limits on AI generation; with heavy-tailed creations, regulation can be persistently constraining.
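The counterfactual criterion can be illustrated with a deliberately small toy model. The sketch below is not the paper's construction: it assumes works are integers on a line and that the generator returns every integer between the smallest and largest work in the corpus, which is one simple example of a closure operator (extensive, monotone, idempotent). The function names `generate`, `infringed_works`, and `permissible` are hypothetical and chosen only for this illustration.

```python
from typing import FrozenSet, Iterable


def generate(corpus: Iterable[int]) -> FrozenSet[int]:
    """Toy closure operator (an assumption for illustration):
    every integer between the smallest and largest work in the corpus."""
    works = list(corpus)
    if not works:
        return frozenset()
    return frozenset(range(min(works), max(works) + 1))


def infringed_works(output: int, corpus: FrozenSet[int]) -> FrozenSet[int]:
    """Works w such that `output` can no longer be generated
    once w alone is removed from the corpus."""
    return frozenset(w for w in corpus if output not in generate(corpus - {w}))


def permissible(corpus: FrozenSet[int]) -> FrozenSet[int]:
    """Generated outputs that do not infringe on any single work."""
    return frozenset(
        x for x in generate(corpus) if not infringed_works(x, corpus)
    )


corpus = frozenset({1, 4, 6, 9})
print(sorted(generate(corpus)))            # everything from 1 through 9
print(sorted(infringed_works(2, corpus)))  # output 2 depends on work 1
print(sorted(permissible(corpus)))         # outputs that need no single work
```

In this toy corpus, outputs near the extremes depend on the outlying works 1 and 9, while outputs between 4 and 6 survive the removal of any single work. This loosely mirrors the intuition in the abstract that outlying creations are the ones on which generated outputs can depend essentially, though the paper's formal results concern general closure operators and asymptotic behavior rather than this one-dimensional example.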
