The GigaMIDI Dataset with Features for Expressive Music Performance Detection
Keon Ju Maverick Lee, Jeff Ens, Sara Adkins, Pedro Sarmento, Mathieu Barthet, Philippe Pasquier
TL;DR
This work addresses the challenge of distinguishing expressive from non-expressive MIDI performances at the track level, using a massive symbolic music corpus, GigaMIDI, and a set of novel heuristics. The authors introduce three expressiveness detectors (DNVR, DNODR, and NOMML), with NOMML showing perfect separation on ground-truth data and enabling the creation of a large expressive subset (1,655,649 tracks). They provide extensive dataset statistics, standardized preprocessing, and a public HuggingFace release to support MIR and symbolic music research. The work highlights practical impacts for symbolic music generation, data mining, and digital musicology, while acknowledging biases and limitations in ground-truth coverage and instrument representation. Overall, NOMML emerges as a robust, scalable metric for expressive performance detection across General MIDI (GM) instruments, facilitating future studies of expressive generation and analysis in large symbolic corpora.
Abstract
The Musical Instrument Digital Interface (MIDI), introduced in 1983, revolutionized music production by allowing computers and instruments to communicate efficiently. MIDI files encode musical instructions compactly, facilitating convenient music sharing. They benefit Music Information Retrieval (MIR), aiding research on music understanding, computational musicology, and generative music. The GigaMIDI dataset contains over 1.4 million unique MIDI files, encompassing 1.8 billion MIDI note events and over 5.3 million MIDI tracks. GigaMIDI is currently the largest collection of symbolic music in MIDI format available for research purposes under fair dealing. Distinguishing between non-expressive and expressive MIDI tracks is challenging, as MIDI files do not inherently make this distinction. To address this issue, we introduce a set of innovative heuristics for detecting expressive music performance. These include the Distinctive Note Velocity Ratio (DNVR) heuristic, which analyzes MIDI note velocity; the Distinctive Note Onset Deviation Ratio (DNODR) heuristic, which examines deviations in note onset times; and the Note Onset Median Metric Level (NOMML) heuristic, which evaluates onset positions relative to metric levels. Our evaluation demonstrates that these heuristics effectively differentiate between non-expressive and expressive MIDI tracks. Furthermore, after evaluation, we use the NOMML heuristic to create the largest expressive MIDI dataset. This curated version of GigaMIDI comprises the expressively performed instrument tracks detected by NOMML. It covers all General MIDI instruments and totals 1,655,649 tracks, or 31% of the tracks in GigaMIDI.
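To make the intuition behind the velocity and onset heuristics concrete, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the function names, the normalization by 127 (the number of non-zero MIDI velocity values), the normalization by the grid step, and the `grid_division` parameter are all assumptions made for illustration. NOMML is omitted because it depends on a definition of metric levels that is not given here.

```python
def distinct_note_velocity_ratio(velocities):
    """Fraction of the 127 non-zero MIDI velocity values used in a track.

    Assumption: normalizing the distinct-velocity count by 127 is an
    illustrative choice, not a formula taken from the abstract.
    """
    distinct = {v for v in velocities if 1 <= v <= 127}
    return len(distinct) / 127


def distinct_onset_deviation_ratio(onset_ticks, ticks_per_quarter, grid_division=4):
    """Fraction of possible within-grid-cell offsets that appear among onsets.

    Assumption: the grid is a fixed subdivision of the quarter note
    (grid_division=4 gives sixteenth notes); both the grid and the
    normalization are hypothetical choices for this sketch.
    """
    step = ticks_per_quarter // grid_division
    if step <= 0:
        raise ValueError("ticks_per_quarter must be at least grid_division")
    # Offset of each onset from the start of the grid cell it falls in.
    deviations = {t % step for t in onset_ticks}
    return len(deviations) / step


# A quantized track: one velocity value, every onset exactly on the grid.
quantized_velocities = [100, 100, 100, 100]
quantized_onsets = [0, 120, 240, 360]

# A performed track: varied velocities, onsets slightly off the grid.
performed_velocities = [61, 74, 88, 93]
performed_onsets = [3, 118, 245, 357]

for name, vels, onsets in [
    ("quantized", quantized_velocities, quantized_onsets),
    ("performed", performed_velocities, performed_onsets),
]:
    dnvr = distinct_note_velocity_ratio(vels)
    dnodr = distinct_onset_deviation_ratio(onsets, ticks_per_quarter=480)
    print(f"{name}: velocity ratio={dnvr:.4f}, onset deviation ratio={dnodr:.4f}")
```

Under these assumptions, a track with a single velocity value and grid-aligned onsets scores lower on both ratios than a track with varied velocities and off-grid onsets, which is the kind of contrast the heuristics are described as exploiting.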
