Probabilistic Analysis of Copyright Disputes and Generative AI Safety

Hiroaki Chiba-Okabe

TL;DR

This paper develops a probabilistic formalism for evaluating copyright infringement disputes and the safety of generative AI. It formalizes evidentiary principles as binary random variables and conditional probabilities, enabling a rigorous analysis of the inverse ratio rule and its applicability under clearly specified monotonicity assumptions. It then assesses Near Access-Freeness (NAF) as a training-data risk-mitigation condition, deriving bounds like $P(Z=z|Access=1) \le e^{\epsilon} P(Z=z|Access=0)$ and linking NAF to inference about access via $A_M$ and $EA_M$ through the function $\Gamma(\epsilon, \delta)$. The results show that while the inverse ratio rule can be justified under natural assumptions and NAF can reduce infringement risk, both approaches face normative concerns and practical limitations, including transparency and retrospective attribution.
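To make the NAF-style bound concrete, the sketch below computes the smallest $\epsilon$ for which $P(Z=z|Access=1) \le e^{\epsilon} P(Z=z|Access=0)$ holds for every output $z$. The two output distributions, the output labels, and the function name `smallest_epsilon` are hypothetical and chosen only for illustration; the sketch also ignores the $\delta$ parameter that appears in $\Gamma(\epsilon, \delta)$.

```python
import math

def smallest_epsilon(p_access, p_no_access):
    """Smallest eps >= 0 with p_access[z] <= exp(eps) * p_no_access[z] for all z.

    Both arguments are dicts mapping an output z to its probability.
    Returns math.inf if some output is possible with access but impossible without.
    """
    eps = 0.0
    for z, p1 in p_access.items():
        if p1 == 0.0:
            continue  # the bound holds trivially for this output
        p0 = p_no_access.get(z, 0.0)
        if p0 == 0.0:
            return math.inf  # no finite eps can satisfy the bound
        eps = max(eps, math.log(p1 / p0))
    return eps

# Hypothetical output distributions (illustrative numbers, not from the paper).
p_access = {"a": 0.50, "b": 0.30, "c": 0.20}      # stands in for P(Z=z | Access=1)
p_no_access = {"a": 0.25, "b": 0.45, "c": 0.30}   # stands in for P(Z=z | Access=0)

eps = smallest_epsilon(p_access, p_no_access)
print(f"smallest epsilon: {eps:.4f}")
```

With these numbers the binding output is `"a"`, whose probability doubles under access, so the computed value equals $\ln 2$. A smaller $\epsilon$ means that having access changes the output distribution less.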

Abstract

This paper presents a probabilistic approach to analyzing copyright infringement disputes. Evidentiary principles shaped by case law are formalized in probabilistic terms, and the "inverse ratio rule", a controversial legal doctrine adopted by some courts, is examined. Although this rule has faced significant criticism, a formal proof demonstrates its validity, provided it is properly defined. The probabilistic approach is further employed to study the copyright safety of generative AI. Specifically, the Near Access-Free (NAF) condition, previously proposed as a strategy for mitigating the heightened copyright infringement risks of generative AI, is evaluated. The analysis reveals limitations in its justifiability and efficacy.

Paper Structure (8 sections, 3 theorems, 2 equations)

Introduction
Probabilistic Analysis of Copyright Disputes
Formalization of Evidentiary Principles
The Inverse Ratio Rule
Near Access-Freeness as a Framework for Copyright Safety of Generative AI
Concluding Remarks
The Inverse Ratio Rule with Strict Inequality
Proofs

Key Result

proposition 1

The following inverse ratio rule holds for any $\lambda\in(0,1)$: If $j \geq j^{\prime}$, then $\min\{i:P(C=1|S=i,\, EA=j)> \lambda\} \leq \min\{i:P(C=1|S=i,\, EA=j^{\prime})> \lambda\}$. Similarly, if $i\geq i^{\prime}$, then $\min\{j:P(C=1|S=i,\, EA=j)> \lambda\} \leq \min\{j:P(C=1|S=i^{\prime},\, EA=j)> \lambda\}$.
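The threshold comparison in the proposition can be checked numerically on a small example. The sketch below uses a hypothetical table standing in for $P(C=1|S=i,\, EA=j)$, with values chosen to be nondecreasing in both $i$ and $j$. That monotonicity is an illustrative assumption made here, not a restatement of the paper's exact conditions. The helper names `min_similarity` and `min_access` are likewise made up for this example, and an empty set is given minimum `math.inf`.

```python
import math

# Hypothetical table: POST[i][j] stands in for P(C=1 | S=i, EA=j).
# Rows index the similarity level i, columns index the access-evidence level j.
# Values are illustrative and nondecreasing along both axes.
POST = [
    [0.05, 0.10, 0.20],
    [0.15, 0.35, 0.55],
    [0.40, 0.60, 0.80],
    [0.70, 0.85, 0.95],
]

def min_similarity(post, j, lam):
    """Smallest i with post[i][j] > lam; math.inf if no level exceeds lam."""
    for i, row in enumerate(post):
        if row[j] > lam:
            return i
    return math.inf

def min_access(post, i, lam):
    """Smallest j with post[i][j] > lam; math.inf if no level exceeds lam."""
    for j, p in enumerate(post[i]):
        if p > lam:
            return j
    return math.inf

lam = 0.5
# Similarity required to clear lam, for each level of access evidence.
sim_thresholds = [min_similarity(POST, j, lam) for j in range(len(POST[0]))]
# Access evidence required to clear lam, for each level of similarity.
access_thresholds = [min_access(POST, i, lam) for i in range(len(POST))]

print(sim_thresholds)     # → [3, 2, 1]
print(access_thresholds)  # → [inf, 2, 1, 0]
```

Both lists are nonincreasing: stronger evidence of access lowers the similarity needed to push the probability of copying above $\lambda$, and greater similarity lowers the access evidence needed.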

Theorems & Definitions (5)

definition 1
proposition 1
definition 2
proposition 2
proposition 3
