Exploring Transformer-Based Music Overpainting for Jazz Piano Variations
Eleanor Row, Ivan Shanin, György Fazekas
TL;DR
This work addresses the data bottleneck in music overpainting for jazz piano variations by constructing VAR4000, a large lead-sheet-aligned dataset of 4,352 Original-Variation pairs (52,224 after augmentation), and using it to evaluate transformer-based models. It compares two transformer configurations on VAR4000 and on the smaller JAZZVAR dataset. Preliminary results suggest that deeper architectures generalise better on the larger corpus and that VAR4000 supports improved scalability. The authors introduce a semi-automatic data pipeline, RemiPlus tokenization, and an evaluation approach that mitigates data leakage, establishing stronger baselines for future work. The study points to the potential of scale-driven transformer models to support GenAI in music composition and identifies directions for future work: dataset expansion, custom loss functions, and subjective expert evaluation.
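The summary above gives the pair counts but not how the augmentation or the leakage safeguard works. As a hedged illustration only: 52,224 / 4,352 = 12, which is consistent with transposing each pair into 12 keys, although the text does not confirm this. One common way to keep augmented copies from leaking between splits is to partition the source pairs first and augment each split afterwards. The sketch below assumes both of these; the function name `split_then_augment`, its parameters, and the use of plain MIDI pitch lists in place of tokenised sequences are all hypothetical and not taken from the paper.

```python
import random


def split_then_augment(pairs, test_fraction=0.1, shifts=range(12), seed=0):
    """Split (original, variation) pairs BEFORE augmenting them.

    Assumptions (not from the paper): each piece is a list of MIDI pitch
    integers, and augmentation is transposition by `shifts` semitones.
    Every returned item is (source_id, original, variation), so all
    transposed copies of one source pair stay in the same split.
    """
    rng = random.Random(seed)
    ids = list(range(len(pairs)))
    rng.shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    test_ids = set(ids[:n_test])

    def transpose(notes, shift):
        return [pitch + shift for pitch in notes]

    train, test = [], []
    for source_id, (original, variation) in enumerate(pairs):
        bucket = test if source_id in test_ids else train
        for shift in shifts:
            bucket.append(
                (source_id, transpose(original, shift), transpose(variation, shift))
            )
    return train, test


# Toy data: 10 made-up pairs stand in for real Original-Variation pairs.
toy_pairs = [([60 + i, 64 + i], [60 + i, 62 + i, 64 + i]) for i in range(10)]
train, test = split_then_augment(toy_pairs)

# No source pair contributes items to both splits.
assert {t[0] for t in train}.isdisjoint({t[0] for t in test})

# A 12x factor reproduces the reported augmented size.
print(4352 * 12)  # 52224
```

Splitting after augmentation would instead let transposed copies of the same pair land in both training and test sets, which inflates test scores; that is the failure mode this ordering avoids.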
Abstract
This paper explores transformer-based models for music overpainting, focusing on jazz piano variations. Music overpainting generates new variations while preserving the melodic and harmonic structure of the input. Existing approaches are limited by small datasets, which restricts scalability and diversity. We introduce VAR4000, a subset of a larger dataset of jazz piano performances, consisting of 4,352 training pairs and built with a semi-automatic pipeline. We evaluate two transformer configurations on VAR4000 and compare their performance with that obtained on the smaller JAZZVAR dataset. Preliminary results show promising improvements in generalisation and performance with the larger dataset, highlighting the potential of transformer models to scale effectively for music overpainting on larger and more diverse datasets.
