High Resolution Guitar Transcription via Domain Adaptation

Xavier Riley, Drew Edwards, Simon Dixon

TL;DR

The paper addresses data scarcity in automatic music transcription (AMT) for instruments other than piano: it aligns scores to the activations of a high-resolution piano transcription model and uses the aligned data to train a guitar transcription system. It introduces a two-stage audio-score alignment workflow that turns 79 professionally transcribed solo guitar scores into a guitar-focused training set via domain adaptation. The resulting model achieves state-of-the-art zero-shot transcription results on GuitarSet and competitive supervised performance, and it generalizes across diverse guitar timbres. The authors release a curated guitar annotation dataset and validate the alignment accuracy, offering a practical, data-efficient route for extending AMT to new instruments and potentially to other instrument families. The findings suggest that digitized scores can bootstrap AMT for instruments with limited labeled data and support future dataset expansion.

Abstract

Automatic music transcription (AMT) has achieved high accuracy for piano due to the availability of large, high-quality datasets such as MAESTRO and MAPS, but comparable datasets are not yet available for other instruments. In recent work, however, it has been demonstrated that aligning scores to transcription model activations can produce high-quality AMT training data for instruments other than piano. Focusing on the guitar, we refine this approach to training on score data using a dataset of commercially available score-audio pairs. We propose the use of a high-resolution piano transcription model to train a new guitar transcription model. The resulting model obtains state-of-the-art transcription results on GuitarSet in a zero-shot context, improving on previously published methods.
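
To make the idea of aligning a score to transcription model activations concrete, here is a minimal sketch. It assumes the score has been rendered to a binary piano roll and that the model outputs frame-wise pitch activations of the same pitch dimension. The alignment algorithm is not specified in this summary, so plain dynamic time warping (DTW) is used as a stand-in; the function names `dtw_path` and `first_audio_frame` are hypothetical and do not come from the paper.

```python
import numpy as np

def dtw_path(score_roll, activations):
    """Align score frames to audio frames with plain DTW (illustrative stand-in).

    score_roll:  (n_score_frames, n_pitches) binary piano roll from the score
    activations: (n_audio_frames, n_pitches) frame-wise pitch activations in [0, 1]
    Returns a list of (score_frame, audio_frame) pairs from start to end.
    """
    score_roll = np.asarray(score_roll, dtype=float)
    activations = np.asarray(activations, dtype=float)

    # Cosine distance between every score frame and every audio frame.
    eps = 1e-8
    s = score_roll / (np.linalg.norm(score_roll, axis=1, keepdims=True) + eps)
    a = activations / (np.linalg.norm(activations, axis=1, keepdims=True) + eps)
    cost = 1.0 - s @ a.T

    # Accumulated cost with the three standard DTW steps.
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev

    # Backtrack from the last cell to the first.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i - 1, j - 1))
    return path[::-1]

def first_audio_frame(path, score_frame):
    """Return the first audio frame aligned to a given score frame."""
    for s_idx, a_idx in path:
        if s_idx == score_frame:
            return a_idx
    raise ValueError("score frame not present in path")

# Toy example: two notes in the score, performed at half the tempo.
score = np.zeros((4, 3))
score[0:2, 0] = 1.0   # first note, pitch index 0
score[2:4, 1] = 1.0   # second note, pitch index 1
acts = np.zeros((8, 3))
acts[0:4, 0] = 0.9
acts[4:8, 1] = 0.8
path = dtw_path(score, acts)
print(first_audio_frame(path, 2))
```

In this sketch the warped onset frames would be multiplied by the model's frame hop to obtain note times in seconds, which could then serve as training labels for the audio. The quadratic DTW loop is for clarity only and would be too slow for full-length recordings.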

Paper Structure

This paper contains 11 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Diagram of the process used to validate the alignment accuracy of our proposed method