Detecting Notational Errors in Digital Music Scores
Léo Géré, Nicolas Audebert, Florent Jacquemard
TL;DR
This work tackles notational errors in digital music scores with a two-step, modular error-detection pipeline. The first step checks rhythm/time consistency within MusicXML, while the second applies format-agnostic, rule-based contextual validation via a state machine that runs over a tokenized representation of the score. Applied to the ASAP piano-score dataset, the approach identified notational errors in about 42% of scores and guided manual fixes that improve data quality, demonstrating the method's practicality for data curation in music information retrieval. The framework is designed to be extensible to additional formats and rules, offering a foundation for more comprehensive score validation beyond MusicXML.
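To make the first step concrete, here is a minimal Python sketch of one possible rhythm/time consistency check on MusicXML: it compares the content of each measure with the length implied by the time signature. This is an illustration under stated assumptions, not the authors' implementation. The function name `find_duration_mismatches` and the toy score are hypothetical, only simple time signatures are handled, and pickup measures (marked `implicit="yes"`) are skipped.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction


def find_duration_mismatches(part):
    """Return (measure number, found length, expected length) for each measure
    whose content does not match the current time signature.
    Lengths are expressed in MusicXML duration units."""
    divisions = beats = beat_type = None
    mismatches = []
    for measure in part.findall("measure"):
        # Simplification: only the first <attributes> of a measure is read.
        attributes = measure.find("attributes")
        if attributes is not None:
            if attributes.find("divisions") is not None:
                # <divisions> = duration units per quarter note.
                divisions = Fraction(attributes.findtext("divisions").strip())
            time = attributes.find("time")
            if time is not None and time.find("beats") is not None:
                # Assumes a simple signature such as 4/4, not "3+2"/8.
                beats = int(time.findtext("beats"))
                beat_type = int(time.findtext("beat-type"))
        if divisions is None or beats is None:
            continue  # nothing to compare against yet
        if measure.get("implicit") == "yes":
            continue  # pickup measures are legitimately incomplete
        expected = divisions * 4 * beats / beat_type

        position = furthest = 0
        for element in measure:
            text = element.findtext("duration")
            if text is None:
                continue  # grace notes, directions, attributes, ...
            duration = Fraction(text.strip())
            if element.tag == "note":
                # Chord notes share the onset of the previous note.
                if element.find("chord") is None:
                    position += duration
            elif element.tag == "forward":
                position += duration
            elif element.tag == "backup":
                position -= duration
            furthest = max(furthest, position)
        if furthest != expected:
            mismatches.append((measure.get("number"), furthest, expected))
    return mismatches


# Toy score (hypothetical): measure 1 is complete, measure 2 lacks one beat.
NOTE = ("<note><pitch><step>C</step><octave>4</octave></pitch>"
        "<duration>1</duration></note>")
SCORE = (
    "<score-partwise><part id='P1'>"
    "<measure number='1'><attributes><divisions>1</divisions>"
    "<time><beats>4</beats><beat-type>4</beat-type></time></attributes>"
    + NOTE * 4 + "</measure>"
    "<measure number='2'>" + NOTE * 3 + "</measure>"
    "</part></score-partwise>"
)

part = ET.fromstring(SCORE).find("part")
print(find_duration_mismatches(part))
```

On the toy score, only the second measure is reported, since it holds three quarter notes in a 4/4 bar. A fuller check would also track each voice separately and compare the written note type with the encoded duration.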
Abstract
Music scores are used to precisely store music pieces for transmission and preservation. To represent and manipulate these complex objects, various formats have been tailored for different use cases. While music notation follows specific rules, digital formats usually enforce them leniently. Hence, digital music scores vary widely in quality, due to software and format specificities, conversion issues, and dubious user inputs. Problems range from minor engraving discrepancies to major notation mistakes. Yet data quality is a major issue in music information extraction and retrieval. We present an automated approach to detect notational errors, aiming at precisely localizing defects in scores. We identify two types of errors: i) rhythm/time inconsistencies in the encoding of individual musical elements, and ii) contextual errors, i.e., notation mistakes that break commonly accepted musical rules. We detect the latter with a modular state machine that can easily be extended with rules representing the usual conventions of common Western music notation. Finally, we apply this error-detection method to the ASAP piano score dataset. We find that around 40% of the scores contain at least one notational error, and we manually fix several of them to improve the dataset's quality.
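The modular state machine for contextual errors can be pictured with the following Python sketch. It is a hedged illustration only: the token vocabulary (`"clef"`, `"note"`, `"tie_start"`), the two rules, and the names `ClefBeforeNotes`, `TieSamePitch`, and `validate` are all hypothetical and are not taken from the authors' implementation. Each rule is a small state machine with its own state, and the validator feeds every token to every rule, so adding a convention means adding one class.

```python
class ClefBeforeNotes:
    """Hypothetical rule: a clef must appear before the first note."""
    name = "clef"

    def __init__(self):
        self.seen_clef = False
        self.reported = False

    def step(self, token):
        kind, _ = token
        if kind == "clef":
            self.seen_clef = True
        elif kind == "note" and not self.seen_clef and not self.reported:
            self.reported = True  # report the problem only once
            return "note before any clef"
        return None

    def finish(self):
        return None


class TieSamePitch:
    """Hypothetical rule: a tie must end on the next note, at the same pitch."""
    name = "tie"

    def __init__(self):
        self.pending = None  # pitch of the currently open tie, if any

    def step(self, token):
        kind, value = token
        if kind == "tie_start":
            previous, self.pending = self.pending, value
            if previous is not None:
                return f"tie from {previous} opened twice"
        elif kind == "note" and self.pending is not None:
            expected, self.pending = self.pending, None
            if value != expected:
                return f"tie from {expected} ends on {value}"
        return None

    def finish(self):
        if self.pending is not None:
            return f"tie from {self.pending} never closed"
        return None


def validate(tokens, rules):
    """Run every rule over the token stream.
    Returns a list of (token index, rule name, message)."""
    errors = []
    for index, token in enumerate(tokens):
        for rule in rules:
            message = rule.step(token)
            if message:
                errors.append((index, rule.name, message))
    for rule in rules:  # end-of-stream checks
        message = rule.finish()
        if message:
            errors.append((len(tokens), rule.name, message))
    return errors


tokens = [
    ("note", "C4"),       # index 0: no clef has been seen yet
    ("clef", "G"),
    ("note", "C4"),
    ("tie_start", "C4"),
    ("note", "D4"),       # index 4: the tie should end on C4
]
for error in validate(tokens, [ClefBeforeNotes(), TieSamePitch()]):
    print(error)
```

Because every error carries the index of the offending token, defects can be localized precisely in the score, and because the rules only see tokens, the same rules apply to any format that can be tokenized.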
