Track Role Prediction of Single-Instrumental Sequences
Changheon Han, Suhyun Lee, Minsam Ko
TL;DR
This work tackles track-role prediction for single-instrument sequences in both the symbolic and audio domains. It fine-tunes a pre-trained model for each domain (MusicBERT for MIDI data and PANNs for log-mel spectrograms) and evaluates them on the ComMU and SCM datasets. The best symbolic model reaches an accuracy of 0.871 and the best audio model 0.843, which suggests the approach is applicable in both domains for AI-driven music generation and analysis. The study also identifies limitations in handling diverse musical forms and proposes curriculum learning as a direction for future improvement.
Abstract
In the composition process, selecting appropriate single-instrumental music sequences and assigning their track-roles are indispensable tasks. However, manually determining the track-role of a large number of music samples is time-consuming and labor-intensive. This study introduces a deep learning model designed to automatically predict the track-role of single-instrumental music sequences. Our evaluations show a prediction accuracy of 87% in the symbolic domain and 84% in the audio domain. The proposed track-role prediction methods hold promise for future applications in AI music generation and analysis.
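The accuracy figures above are classification accuracies: the fraction of sequences whose predicted track-role matches the reference label. The following is a minimal sketch of that metric, not the authors' evaluation code. The label names in `TRACK_ROLES`, the helper names `accuracy` and `per_role_accuracy`, and the toy predictions are all assumptions made for illustration; the exact label set and evaluation protocol are not given in this section.

```python
from collections import Counter

# Assumed label set, for illustration only; the section does not list the roles.
TRACK_ROLES = ["main_melody", "sub_melody", "accompaniment", "bass", "pad", "riff"]


def accuracy(y_true, y_pred):
    """Fraction of sequences whose predicted track-role equals the reference."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_role_accuracy(y_true, y_pred):
    """Accuracy broken down by reference track-role (i.e. per-class recall)."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {role: hits[role] / totals[role] for role in totals}


# Toy data (hypothetical, not taken from ComMU or SCM).
reference = ["bass", "pad", "riff", "bass"]
predicted = ["bass", "pad", "bass", "bass"]

overall = accuracy(reference, predicted)
by_role = per_role_accuracy(reference, predicted)
```

A per-role breakdown is included because a single overall accuracy can hide roles that are systematically confused with one another; whether the paper reports such a breakdown is not stated here.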
