EMelodyGen: Emotion-Conditioned Melody Generation in ABC Notation with the Musical Feature Template
Monan Zhou, Xiaobing Li, Feng Yu, Wei Li
TL;DR
This work tackles emotion-conditioned melody generation in ABC notation by designing a musical feature template that maps emotional control onto a set of controllable and embedded features, guided by correlations from emotion-labeled datasets and by findings from music psychology. To overcome data scarcity, it auto-labels well-structured scores to create the Rough4Q dataset. A Tunesformer backbone pre-trained on Rough4Q achieves high parsing reliability (a music21 parsing rate of up to 99%) and strong alignment with human emotion perception (about 91% in blind listening tests). Ablation studies show that the five features (mode, tempo, pitchSD, RMS, and octave control) collectively drive emotional expression, with tempo, pitchSD, and mode being particularly impactful. The approach demonstrates that template-based emotion control, combined with data augmentation and feature embedding, is a viable path to reliable emotion-conditioned melody generation in ABC notation, with practical implications for expressive symbolic music generation.
Abstract
The EMelodyGen system focuses on emotional melody generation in ABC notation, controlled by a musical feature template. Because well-structured, emotionally labeled sheet music is scarce, we designed the template from statistical correlations between musical features and emotion labels, derived from small-scale emotional symbolic music datasets, together with conclusions from music psychology. We then used the template to automatically annotate a large, well-structured sheet music collection with rough emotional labels, converted the scores into ABC notation, and reduced label imbalance through data augmentation, resulting in a dataset named Rough4Q. Our system backbone, pre-trained on Rough4Q, achieves a music21 parsing rate of up to 99%, and melodies generated with our template reach 91% alignment with the intended emotional expression in blind listening tests. Ablation studies further validate the effectiveness of the feature controls in the template. Code and demos are available at https://github.com/monetjoe/EMelodyGen.
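
To make the idea of template-based rough labeling concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a four-quadrant valence/arousal scheme (suggested by the name Rough4Q but not spelled out in this abstract), uses only two of the five features (mode as a valence proxy, tempo as an arousal proxy), and relies on a hypothetical 100 BPM cutoff. The names ScoreFeatures, rough_quadrant, and TEMPO_THRESHOLD_BPM are illustrative and do not come from the EMelodyGen code base.

```python
from dataclasses import dataclass

# Hypothetical cutoff between "slow" and "fast"; the abstract does not state one.
TEMPO_THRESHOLD_BPM = 100.0


@dataclass
class ScoreFeatures:
    """Two of the template features, assumed to be extracted from a score beforehand."""
    mode: str         # "major" or "minor"
    tempo_bpm: float  # beats per minute


def rough_quadrant(features: ScoreFeatures,
                   threshold: float = TEMPO_THRESHOLD_BPM) -> str:
    """Assign a rough Q1..Q4 label from mode and tempo (illustrative assumption)."""
    if features.mode not in ("major", "minor"):
        raise ValueError(f"unsupported mode: {features.mode!r}")

    high_valence = features.mode == "major"        # assumed valence proxy
    high_arousal = features.tempo_bpm >= threshold  # assumed arousal proxy

    if high_valence and high_arousal:
        return "Q1"  # high valence, high arousal
    if high_arousal:
        return "Q2"  # low valence, high arousal
    if not high_valence:
        return "Q3"  # low valence, low arousal
    return "Q4"      # high valence, low arousal


if __name__ == "__main__":
    for feats in (ScoreFeatures("major", 140.0), ScoreFeatures("minor", 60.0)):
        print(feats, "->", rough_quadrant(feats))
```

Labels produced this way are deliberately coarse, which is consistent with the abstract's description of them as rough; a fuller template would also account for pitchSD, RMS, and octave.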
