MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing
Shangda Wu, Yashan Wang, Xiaobing Li, Feng Yu, Maosong Sun
TL;DR
MelodyT5 tackles data scarcity and task fragmentation in symbolic music by unifying seven melody-centric tasks as score-to-score transformations within a Transformer encoder-decoder framework. It introduces bar patching for ABC notation and a multi-layer architecture that includes a patch-level encoder and decoder and a character-level decoder, all pre-trained on MelodyHub to enable effective multi-task transfer learning. The dataset provides more than one million task instances (1,067,747) across diverse tasks, supporting both pre-training and evaluation. Experiments show MelodyT5 outperforms task-specific baselines on most tasks, with both objective gains (e.g., reduced BPB and improved CTRL, CTnCTR, PCS, MCTD, F1) and positive subjective feedback, highlighting the value of unified score-to-score modeling in symbolic music processing and offering a comprehensive resource for future work.
Abstract
In the domain of symbolic music research, the progress of developing scalable systems has been notably hindered by the scarcity of available training data and the demand for models tailored to specific tasks. To address these issues, we propose MelodyT5, a novel unified framework that leverages an encoder-decoder architecture tailored for symbolic music processing in ABC notation. This framework challenges the conventional task-specific approach, considering various symbolic music tasks as score-to-score transformations. Consequently, it integrates seven melody-centric tasks, from generation to harmonization and segmentation, within a single model. Pre-trained on MelodyHub, a newly curated collection featuring over 261K unique melodies encoded in ABC notation and encompassing more than one million task instances, MelodyT5 demonstrates superior performance in symbolic music processing via multi-task transfer learning. Our findings highlight the efficacy of multi-task transfer learning in symbolic music processing, particularly for data-scarce tasks, challenging the prevailing task-specific paradigms and offering a comprehensive dataset and framework for future explorations in this domain.
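To make the idea of processing ABC notation bar by bar more concrete, the following is a minimal sketch of how a tune body might be split into fixed-length bar patches. It is an illustration under stated assumptions, not the authors' implementation: the function name `bar_patches`, the `patch_size` default, and the padding character are all hypothetical, and the barline pattern covers only common ABC barlines (numbered endings, inline fields, and quoted text containing `|` are not handled).

```python
import re

# Common ABC barline tokens, longest alternatives first so that
# e.g. "|]" is matched as one token rather than "|" followed by "]".
BARLINE = re.compile(r"(\|\]|\|\||\[\||\|:|:\||::|\|)")


def bar_patches(abc_body: str, patch_size: int = 16, pad: str = "\0") -> list[str]:
    """Split an ABC tune body into bars and return fixed-length patches.

    Each bar keeps its trailing barline. Bars longer than `patch_size`
    are truncated; shorter bars are right-padded with `pad`.
    Both defaults are hypothetical values chosen for illustration.
    """
    bars = []
    current = ""
    # re.split with a capturing group keeps the barlines in the result.
    for part in BARLINE.split(abc_body.strip()):
        if not part:
            continue
        current += part
        if BARLINE.fullmatch(part):
            # A barline closes the current bar.
            bars.append(current)
            current = ""
    if current:
        # Trailing material without a closing barline forms a final bar.
        bars.append(current)
    return [bar[:patch_size].ljust(patch_size, pad) for bar in bars]


if __name__ == "__main__":
    for patch in bar_patches("C2 D2|E2 F2|G4|]", patch_size=8, pad="."):
        print(repr(patch))
```

In a design of this kind, a patch-level model would see one vector per bar while a character-level decoder reconstructs the characters inside each patch, which shortens the sequence the patch-level layers have to attend over. Note that in this sketch a leading repeat sign such as `|:` becomes a patch of its own.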
