Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew's Treatise
Tornike Karchkhadze, Keren Shao, Shlomo Dubnov
TL;DR
This paper addresses the problem of interpreting the graphic notation of Cornelius Cardew’s Treatise. The authors present a two-stage AI pipeline that first converts score imagery into descriptive text prompts via ChatGPT 4o and then converts those prompts into audio with MusicLDM, augmented by an outpainting method that produces seamless transitions between generated sections. They demonstrate the approach by generating and analyzing multiple tracks and by detailing their compositional setup and prompts. The work expands the creative toolkit for contemporary/experimental music by showing how cross-modal AI can translate abstract visuals into expressive sound, and it suggests concrete paths to further improve control via direct visual-to-audio mappings.
Abstract
This work presents a novel method for composing and improvising music inspired by Cornelius Cardew's Treatise, using AI to bridge graphic notation and musical expression. By leveraging OpenAI's ChatGPT to interpret the abstract visual elements of Treatise, we convert these graphical images into descriptive textual prompts. These prompts are then input into MusicLDM, a pre-trained latent diffusion model designed for music generation. We introduce a technique called "outpainting," which overlaps sections of AI-generated music to create a seamless and cohesive composition. We demonstrate a new perspective on performing and interpreting graphic scores, showing how AI can transform visual stimuli into sound and expand the creative possibilities in contemporary/experimental music composition. Musical pieces are available at https://bit.ly/TreatiseAI
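The overlap idea behind outpainting can be illustrated with a small timing sketch. It only plans where overlapping generation windows would fall on the timeline; it does not call MusicLDM or reproduce the authors' implementation. The function name `plan_windows`, the `Window` type, and the window and overlap lengths are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Window:
    start: float        # window start, in seconds from the start of the piece
    end: float          # window end, in seconds
    context_end: float  # [start, context_end) is reused from the previous window


def plan_windows(total_s: float, window_s: float, overlap_s: float) -> list[Window]:
    """Lay out overlapping windows that together cover total_s seconds.

    Hypothetical helper: each window after the first begins overlap_s seconds
    before the previous window ends, so the shared region can serve as context
    for generating the remainder of the window.
    """
    if not 0 <= overlap_s < window_s:
        raise ValueError("overlap must be non-negative and shorter than the window")
    hop = window_s - overlap_s  # new material contributed by each later window
    windows: list[Window] = []
    start = 0.0
    while True:
        end = start + window_s
        # The first window has no predecessor, so it has no context region.
        context_end = start if not windows else start + overlap_s
        windows.append(Window(start, end, context_end))
        if end >= total_s:
            break  # the last window may overshoot total_s and can be trimmed
        start += hop
    return windows


if __name__ == "__main__":
    for w in plan_windows(total_s=30, window_s=10, overlap_s=5):
        print(w)
```

With these assumed values, every window after the first shares its first 5 seconds with the tail of its predecessor, which is the region a generative model would be conditioned on so that the newly generated material continues it.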
