Live Music Models

Lyria Team; Antoine Caillon; Brian McWilliams; Cassie Tarakajian; Ian Simon; Ilaria Manco; Jesse Engel; Noah Constant; Yunpeng Li; Timo I. Denk; Alberto Lalama; Andrea Agostinelli; Cheng-Zhi Anna Huang; Ethan Manilow; George Brower; Hakan Erdogan; Heidi Lei; Itai Rolnick; Ivan Grishchenko; Manu Orsini; Matej Kastelic; Mauricio Zuluaga; Mauro Verzetti; Michael Dooley; Ondrej Skopek; Rafael Ferrer; Savvas Petridis; Zalán Borsos; Aäron van den Oord; Douglas Eck; Eli Collins; Jason Baldridge; Tom Hume; Chris Donahue; Kehang Han; Adam Roberts

TL;DR

Live music models target real-time, interactive AI-assisted performance: they generate a continuous stream of music while responding to synchronized user control. The authors present two systems, Magenta RealTime (open weights, on-device) and Lyria RealTime (API-based), built on a codec language modeling framework that combines SpectroStream tokenization with MusicCoCa style embeddings and uses a chunk-based encoder-decoder Transformer to sustain real-time throughput. The main contributions are a unified live-model architecture with chunked autoregression, style conditioning from audio and text prompts, and audio-injection mechanisms, evaluated with objective metrics and a user study, together with an open-weights release and an API-based release. The work shows a practical path to on-device and cloud-based AI-assisted live music creation; future directions include ultra-low latency, multi-stem collaboration, and richer control interfaces for performers.

Abstract

We introduce a new class of generative models for music called live music models that produce a continuous stream of music in real time with synchronized user control. We release Magenta RealTime, an open-weights live music model that can be steered using text or audio prompts to control acoustic style. On automatic metrics of music quality, Magenta RealTime outperforms other open-weights music generation models, despite using fewer parameters and offering first-of-its-kind live generation capabilities. We also release Lyria RealTime, an API-based model with extended controls, offering access to our most powerful model with wide prompt coverage. These models demonstrate a new paradigm for AI-assisted music creation that emphasizes human-in-the-loop interaction for live music performance.
