Authorship and the Politics and Ethics of LLM Watermarks
Tim Räz
TL;DR
The paper examines the ethical, political, and philosophical implications of watermarking LLM outputs and proposes a contextual, weak notion of authorship that can include both humans and machines. It analyzes two watermarking schemes (Kirchenbauer et al. 2023 and Christ et al. 2023) to illustrate how injection and detection mechanisms interact with entropy and how control over detection (private vs. public keys) affects authorship certification. The work highlights governance challenges, notably that private-key certification concentrates power in providers and can enable both false positives and false negatives, and proposes a third-party watermarking authority as a potential remedy. It also argues that entropy differences across topics, registers, languages, and translations may produce differential detectability, raising fairness concerns about who benefits from or is disadvantaged by watermark deployment, and calls for empirical validation and policy consideration.
Abstract
Recently, watermarking schemes for large language models (LLMs) have been proposed to distinguish machine-generated text from human-written text. The present paper explores philosophical, political, and ethical ramifications of implementing and using watermarking schemes. A definition of authorship that includes both machines (LLMs) and humans is proposed to serve as a backdrop. It is argued that private watermarks may provide private companies with sweeping rights to determine authorship, which is incompatible with traditional standards of authorship determination. Then, possible ramifications of the so-called entropy dependence of watermarking mechanisms are explored. It is argued that entropy may vary across different socially salient groups. This could lead to group-dependent rates at which machine-generated text is detected. Specifically, groups more interested in low-entropy text may find that machine-generated text of interest to them is harder to detect.
