DAWZY: A New Addition to AI powered "Human in the Loop" Music Co-creation

Aaron C Elkins, Sanchit Singh, Adrian Kieback, Sawyer Blankenship, Uyiosa Philip Amadasun, Aman Chadha

TL;DR

The paper tackles the challenge of translating high-level creative goals into precise DAW edits by introducing DAWZY, an open-source NL-to-ReaScript assistant for REAPER. DAWZY uses a Model Context Protocol-based toolset and GPT-5-driven code generation to ground actions in live project state and deliver reversible edits through a minimal GUI and multimodal input. The paper evaluates DAWZY's reliability across tasks, compares it with Ableton-MCP, and reports positive subjective reception from users. The work demonstrates that natural-language control can augment human creativity in music production and outlines future directions for broader DAW compatibility and targeted fine-tuning for music scripting.

Abstract

Digital Audio Workstations (DAWs) offer fine control, but mapping high-level intent (e.g., "warm the vocals") to low-level edits breaks creative flow. Existing artificial intelligence (AI) music generators are typically one-shot, limiting opportunities for iterative development and human contribution. We present DAWZY, an open-source assistant that turns natural-language (text/voice/hum) requests into reversible actions in REAPER. DAWZY keeps the DAW as the creative hub with a minimal GUI and voice-first interface. DAWZY uses LLM-based code generation as a novel way to significantly reduce the time users spend familiarizing themselves with large interfaces, replacing hundreds of buttons and drop-downs with a chat box. DAWZY also uses three Model Context Protocol tools for live state queries, parameter adjustment, and AI beat generation. It maintains grounding by refreshing state before mutation and ensures safety and reversibility with atomic scripts and undo. In evaluations, DAWZY performed reliably on common production tasks and was rated positively by users across Usability, Control, Learning, Collaboration, and Enjoyment.
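
The grounding and reversibility ideas in the abstract (refresh state before mutation, apply edits atomically, keep them undoable) can be illustrated with a minimal sketch. This is not DAWZY's implementation and does not call REAPER or ReaScript: `Project`, `query_state`, `run_atomic`, and `undo` are hypothetical names, and the project is modeled as a plain in-memory dictionary of track volumes.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for live DAW state: track name -> volume in dB."""
    tracks: dict = field(default_factory=dict)
    undo_stack: list = field(default_factory=list)


def query_state(project):
    """Return a fresh snapshot of the project, as a state-query tool might."""
    return deepcopy(project.tracks)


def run_atomic(project, description, edit):
    """Apply `edit` to a copy of freshly read state; commit all of it or none."""
    before = query_state(project)   # refresh state right before mutating
    working = deepcopy(before)
    edit(working)                   # an exception here leaves the project untouched
    project.undo_stack.append((description, before))
    project.tracks = working        # commit the whole edit in one step


def undo(project):
    """Revert the most recent committed edit and return its description."""
    description, before = project.undo_stack.pop()
    project.tracks = before
    return description


if __name__ == "__main__":
    project = Project(tracks={"Vocals": -6.0, "Drums": -3.0})

    def lower_vocals(tracks):
        tracks["Vocals"] -= 2.0

    run_atomic(project, "Lower vocals by 2 dB", lower_vocals)
    print(project.tracks)
    print("Undid:", undo(project))
    print(project.tracks)
```

Because the edit runs against a copy, a script that fails halfway never leaves the project in a partially edited state, and each committed edit maps to exactly one undo step.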

Paper Structure

This paper contains 11 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: DAWZY Architecture. User intent (text/speech/hum) flows through the Electron gateway to the LLM and MCP tools, then executes as reversible ReaScripts in REAPER. Rounded rectangles denote AI/MCP components; sharp rectangles denote DAW/runtime components; dashed arrows indicate data queries; solid arrows indicate state-changing actions.
  • Figure 2: Mean Opinion Score (MOS) results for DAWZY (N=21). All categories scored above neutral (3).