An Economic Solution to Copyright Challenges of Generative AI

Jiachen T. Wang, Zhun Deng, Hiroaki Chiba-Okabe, Boaz Barak, Weijie J. Su

TL;DR

This work proposes a framework that compensates copyright owners proportionally to their contributions to the creation of AI-generated content, leveraging the probabilistic nature of modern generative AI models and using techniques from cooperative game theory in economics.

Abstract

Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media. There is growing concern that such systems may infringe on the copyright interests of training data contributors. To address the copyright challenges of generative AI, we propose a framework that compensates copyright owners proportionally to their contributions to the creation of AI-generated content. The metric for contributions is quantitatively determined by leveraging the probabilistic nature of modern generative AI models and using techniques from cooperative game theory in economics. This framework enables a platform where AI developers benefit from access to high-quality training data, thus improving model performance. Meanwhile, copyright owners receive fair compensation, driving the continued provision of relevant data for generative model training. Experiments demonstrate that our framework successfully identifies the most relevant data sources used in artwork generation, ensuring a fair and interpretable distribution of revenues among copyright owners.
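To make the game-theoretic idea concrete, the following is a minimal sketch of how a Shapley-style split could be computed for a small set of copyright owners. It is an illustration under stated assumptions, not the paper's implementation: the function names `shapley_values` and `royalty_shares`, the normalization of values into shares, and the toy utility table are all hypothetical. The abstract says only that contributions are measured using the probabilistic nature of the model, so the utility here is a stand-in lookup table rather than a quantity computed from a trained model.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values by averaging marginal gains over all orderings.

    `utility` maps a frozenset of players to a number. In this sketch it is
    a hypothetical stand-in for how well a model trained on that subset of
    owners' data explains the generated content.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def royalty_shares(values):
    """Normalize values so they sum to 1 (an assumed way to split revenue)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("shares are undefined when the total value is not positive")
    return {p: v / total for p, v in values.items()}


# Toy utility table (made-up numbers, for illustration only).
TOY_UTILITY = {
    frozenset(): 0.0,
    frozenset({"A"}): 6.0,
    frozenset({"B"}): 2.0,
    frozenset({"A", "B"}): 10.0,
}

values = shapley_values(["A", "B"], TOY_UTILITY.__getitem__)
shares = royalty_shares(values)
print(values)
print(shares)
```

With this toy table, owner A's marginal gain is 6 when added first and 8 when added second, so A's value is 7 and B's is 3, giving shares of 0.7 and 0.3. Exact enumeration grows factorially with the number of owners, so a sketch like this is practical only for a handful of players; larger settings would need sampling-based approximations.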

Paper Structure (30 sections, 16 equations, 12 figures)

This paper contains 30 sections, 16 equations, 12 figures.

Figures (12)

  • Figure 1: Overview of our method. (a) The artists provide their copyrighted artworks as (part of) the training data for the generative AI model. (b) A user prompts the generative AI and obtains a new artwork. (c) We assess the contribution of each artist to the AI-generated artwork using the Shapley Royalty Share, which determines their compensation.
  • Figure 2: Evaluation of the SRS using the WikiArt (upper) and FlickrLogo-27 (lower) datasets. Each row displays example target images ($x^{(\text{gen})}$'s) for which the SRS is assessed. Left: The heatmap of the SRS of copyright owners in producing the original paintings from different artists (or original logo designs from different brands). Right: The heatmap of the SRS of copyright owners in producing AI-generated paintings in the style of different artists (or AI-generated logo designs of different brands).
  • Figure 3: Left: The text from PubMed Abstract that we use as the query example for contribution evaluation. Right: The Shapley value for the copyright owners.
  • Figure 4: Left: The text from NIH ExPorter, a medical dataset, that we use as the query example for contribution evaluation. Right: The Shapley value for the copyright owners.
  • Figure 5: Left: The text from DeepMind Mathematics (DM Mathematics), a corpus of simple math questions, that we use as the query example for contribution evaluation. Right: The Shapley value for the copyright owners.
  • ...and 7 more figures

Theorems & Definitions (1)

  • Remark 2.1: Adversarial data owners