Watermarking Large Language Models in Europe: Interpreting the AI Act in Light of Technology

Thomas Souverain

TL;DR

This paper addresses the EU AI Act's demand to watermark LLM outputs by operationalizing the criteria of reliability, interoperability, effectiveness, and robustness. It introduces a simple taxonomy of watermarking methods organized by the LLM lifecycle (pre-, in-, and post-processing) and by their placement in next-token distribution or sampling, together with an evaluation framework mapping detectability, robustness, and LLM quality to the Act's requirements. The paper shows that no current technique satisfies all four criteria, highlights the trade-offs among detectability, resilience to attacks, and impact on generation quality, and advocates embedding watermarks in the low-level architecture as a promising path. The work provides actionable guidance for regulators and providers, including open research directions on interoperability, standardized evaluation benchmarks (e.g., MarkLLM), and cross-architecture compatibility to improve trust and governance in European AI deployment. The operational interpretation and comparative lens aim to support compliant, scalable enforcement and to foster robust, trustworthy AI in Europe, while acknowledging the global evolution of standards such as C2PA and related policies. Generation is framed formally as a next-token distribution $p_\theta: V^{(t-1)} \rightarrow \Delta(V)$, and transparency considerations are emphasized as part of that framing.

Abstract

To foster trustworthy Artificial Intelligence (AI) within the European Union, the AI Act requires providers to mark and detect the outputs of their general-purpose models. Article 50 and Recital 133 call for marking methods that are "sufficiently reliable, interoperable, effective and robust". Yet the rapidly evolving and heterogeneous landscape of watermarks for Large Language Models (LLMs) makes it difficult to determine how these four standards can be translated into concrete and measurable evaluations. Our paper addresses this challenge, anchoring the normativity of European requirements in the multiplicity of watermarking techniques. Introducing clear and distinct concepts on LLM watermarking, our contribution is threefold. (1) Watermarking Categorisation: We propose an accessible taxonomy of watermarking methods according to the stage of the LLM lifecycle at which they are applied: before, during, or after training, and during next-token distribution or sampling. (2) Watermarking Evaluation: We interpret the EU AI Act's requirements by mapping each criterion to state-of-the-art evaluations of the robustness and detectability of the watermark and of the quality of the LLM. Since interoperability remains largely untheorised in LLM watermarking research, we propose three normative dimensions to frame its assessment. (3) Watermarking Comparison: We compare current watermarking methods for LLMs against the operationalised European criteria and show that no approach yet satisfies all four standards. Encouraged by emerging empirical tests, we recommend further research into watermarking directly embedded within the low-level architecture of LLMs.
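
To make "watermarking during next-token distribution" and "detectability" more concrete, the following is a minimal sketch of one well-known distribution-level scheme, the green-list logit bias of Kirchenbauer et al. (2023). It is an illustration chosen here, not a method taken from this paper. The toy vocabulary, the random logits standing in for $p_\theta$, and the names `green_list`, `sample_next`, `generate`, and `z_score` are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
import math
import random

# All constants are illustrative assumptions, not values from the paper.
VOCAB_SIZE = 50   # toy vocabulary size
GAMMA = 0.5       # fraction of the vocabulary marked "green" at each step
DELTA = 4.0       # logit bias added to green tokens


def green_list(prev_token: int) -> set[int]:
    """Pseudo-randomly select the green tokens, seeded by the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def sample_next(prev_token: int, logits: list[float],
                rng: random.Random, watermark: bool) -> int:
    """Sample from softmax(logits), optionally biased towards green tokens."""
    if watermark:
        green = green_list(prev_token)
        logits = [x + DELTA if i in green else x for i, x in enumerate(logits)]
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]  # unnormalised softmax
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]


def generate(length: int, seed: int, watermark: bool) -> list[int]:
    """Generate toy token ids; random Gaussian logits stand in for a real LLM."""
    rng = random.Random(seed)
    tokens = [0]
    for _ in range(length):
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
        tokens.append(sample_next(tokens[-1], logits, rng, watermark))
    return tokens


def z_score(tokens: list[int]) -> float:
    """One-proportion z-test: more green tokens than chance would predict?"""
    n = len(tokens) - 1
    hits = sum(tokens[t] in green_list(tokens[t - 1])
               for t in range(1, len(tokens)))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


if __name__ == "__main__":
    print("watermarked z:", round(z_score(generate(200, 1, True)), 2))
    print("plain z:      ", round(z_score(generate(200, 1, False)), 2))
```

In this sketch a detector would flag a text when the z-score exceeds a chosen threshold. A larger `DELTA` makes the mark easier to detect but distorts the next-token distribution more, which is one simple instance of the trade-off between detectability and generation quality discussed above.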

Paper Structure

This paper contains 21 sections, 1 equation, 2 figures, 1 table.

Figures (2)

  • Figure 1: Watermarking an LLM by altering the Generation Process: Four Steps
  • Figure 2: Interpreting the AI Act Criteria for LLM Watermarking: From Overlaps to Measurable Standards