Attribution-by-design: Ensuring Inference-Time Provenance in Generative Music Systems

Fabio Morreale, Wiebke Hutiri, Joan Serrà, Alice Xiang, Yuki Mitsufuji

TL;DR

The paper tackles the lack of provenance and fair remuneration in AI-generated music by introducing attribution-by-design, a framework that embeds inference-time attribution into the architecture of generative systems. It separates training data from inference data and prioritizes inference-time attribution, which provides verifiable links between generated outputs and specific reference songs. The authors outline a four-phase sociotechnical pipeline (user interaction, verification, generation, and compensation) that ensures transparent attribution, consent management, and tiered royalties. This approach aims to realign value capture with creators, offering a practical path toward ethical and scalable compensation in AI-assisted music production.

Abstract

The rise of AI-generated music is diluting royalty pools and revealing structural flaws in existing remuneration frameworks, challenging the well-established artist compensation systems in the music industry. Existing compensation solutions, such as piecemeal licensing agreements, lack scalability and technical rigour, while current data attribution mechanisms provide only uncertain estimates and are rarely implemented in practice. This paper introduces a framework for a generative music infrastructure centred on direct attribution, transparent royalty distribution, and granular control for artists and rights holders. We distinguish ontologically between the training set and the inference set, which allows us to propose two complementary forms of attribution: training-time attribution and inference-time attribution. Here we favour inference-time attribution, as it enables direct, verifiable compensation whenever an artist's catalogue is used to condition a generated output. Users, in turn, benefit from the ability to condition generations on specific songs and receive transparent information about attribution and permitted usage. Our approach offers an ethical and practical solution to the pressing need for robust compensation mechanisms in the era of AI-generated music, ensuring that provenance and fairness are embedded at the core of generative systems.
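
To make the idea of inference-time attribution concrete, the following is a minimal sketch of how a generated output could be linked to the reference songs that conditioned it, and how a fee could then be routed to rights holders. It is not the authors' implementation. All names (`ReferenceSong`, `AttributionRecord`, `verify`, `generate`, `compensate`), the boolean consent flag, and the equal split of the fee are assumptions made purely for illustration; in particular, the equal split is a placeholder and does not represent the tiered royalty scheme the paper refers to.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ReferenceSong:
    """Hypothetical entry in the inference set."""
    song_id: str
    rights_holder: str
    consent_given: bool  # assumed: a simple yes/no consent flag


@dataclass(frozen=True)
class AttributionRecord:
    """Links one generated output to the reference songs that conditioned it."""
    output_id: str
    reference_ids: Tuple[str, ...]


def verify(references: List[ReferenceSong]) -> List[ReferenceSong]:
    """Verification phase (sketch): keep only references with consent."""
    return [ref for ref in references if ref.consent_given]


def generate(output_id: str, references: List[ReferenceSong]) -> AttributionRecord:
    """Generation phase (sketch): no audio is produced here; only the
    provenance of the conditioning songs is recorded."""
    return AttributionRecord(output_id, tuple(ref.song_id for ref in references))


def compensate(
    record: AttributionRecord,
    catalogue: Dict[str, ReferenceSong],
    fee_cents: int,
) -> Dict[str, int]:
    """Compensation phase (sketch): split the fee equally across reference
    songs, in integer cents. Equal split is an illustrative assumption."""
    payouts: Dict[str, int] = {}
    count = len(record.reference_ids)
    if count == 0:
        return payouts
    share, remainder = divmod(fee_cents, count)
    for index, song_id in enumerate(record.reference_ids):
        holder = catalogue[song_id].rights_holder
        # The first `remainder` songs receive one extra cent so that the
        # payouts always sum exactly to the fee.
        extra = 1 if index < remainder else 0
        payouts[holder] = payouts.get(holder, 0) + share + extra
    return payouts
```

Because the attribution record is written at generation time from the explicit list of conditioning songs, the payout needs no after-the-fact estimate of influence, which is the property the abstract contrasts with uncertain training-time attribution estimates.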

Paper Structure

This paper contains 20 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Two stages of attribution (TTA vs. ITA), and separation between inference and training datasets.
  • Figure 2: Attribution-by-design process.
  • Figure 4: Two interaction possibilities for audio-level UIs.
  • Figure 5: A diagram indicating how an LMM can retrieve relevant reference songs from the inference dataset.
  • Figure 6: Pipeline to use a reference song to condition a generation.
  • ...and 1 more figure