Matchmaker: An Open-source Library for Real-time Piano Score Following and Systematic Evaluation

Jiyun Park, Carlos Cancino-Chacón, Suhit Chiruthapudi, Juhan Nam

TL;DR

Matchmaker is an open-source Python library for real-time piano score following and benchmarking. It addresses the lack of a unified framework for comparing models across diverse representations and algorithms. The library organizes score following around input representations, features, and online alignment algorithms, and provides live streaming and simulation modes so that benchmarks on large piano datasets are reproducible. Empirical results show that online time warping variants (especially OLTWArzt) achieve superior accuracy and coverage, while onset-sensitive features such as LSE offer a favorable trade-off between latency and mean absolute error (MAE). The framework supports practical deployment in interactive music systems and provides a scalable benchmark platform for future work on other instruments and modalities.

Abstract

Real-time music alignment, also known as score following, is a fundamental MIR task with a long history and is essential for many interactive applications. Despite its importance, there has been no unified open framework for comparing models, largely because of the inherent complexity of real-time processing and because existing implementations depend on a particular language or system. In addition, low compatibility with the existing MIR ecosystem has made it difficult to develop benchmarks using the large datasets that have become available in recent years. While new studies based on established methods (e.g., dynamic programming, probabilistic models) have emerged, most evaluations compare models only within the same family or on small sets of test data. This paper introduces Matchmaker, an open-source Python library for real-time music alignment that is easy to use and compatible with modern MIR libraries. Using it, we systematically compare methods along two dimensions: music representations and alignment methods. We evaluate these methods on a large test set of solo piano music from the (n)ASAP, Batik, and Vienna4x22 datasets, using a comprehensive set of metrics to ensure robust assessment. Our work aims to establish a benchmark framework for score-following research while providing a practical tool that developers can easily integrate into their applications.
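To make the task concrete, here is a minimal sketch of a dynamic-programming score follower together with a mean absolute error computation. It is an illustrative simplification written for this summary, not Matchmaker's API and not the OLTWArzt algorithm: the function names, the `window` parameter, and the use of plain Euclidean distance between feature vectors are all assumptions.

```python
import math


def follow_score(score_feats, perf_frames, window=10):
    """Greedy windowed online time warping (illustrative sketch only).

    score_feats: list of feature vectors, one per score frame.
    perf_frames: iterable of feature vectors arriving one at a time.
    Returns the estimated score-frame index for each performance frame.
    """
    n = len(score_feats)
    prev_cost = [math.inf] * n  # cumulative costs for the previous frame
    position = 0                # current estimate of the score position
    path = []
    for t, frame in enumerate(perf_frames):
        # Only score frames near the current position are considered.
        lo = max(0, position - window)
        hi = min(n, position + window + 1)
        cost = [math.inf] * n
        for j in range(lo, hi):
            local = math.dist(score_feats[j], frame)
            # Step that advances in the score within the same input frame.
            left = cost[j - 1] if j > lo else math.inf
            if t == 0:
                best_prev = 0.0 if j == 0 else left
            else:
                diag = prev_cost[j - 1] if j > 0 else math.inf
                best_prev = min(prev_cost[j], diag, left)
            cost[j] = local + best_prev
        # The cheapest cell in the window becomes the new position.
        position = min(range(lo, hi), key=cost.__getitem__)
        path.append(position)
        prev_cost = cost
    return path


def mean_absolute_error(estimated, reference):
    """Mean absolute difference between estimated and reference positions."""
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(reference)
```

In a real system the feature vectors would come from audio or MIDI frames, and the positions would be converted to score time before computing the error. The sketch commits to a position after every frame and never revises it, which is the constraint that separates online alignment from offline dynamic time warping.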

Paper Structure

This paper contains 20 sections, 1 equation, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Overview of the score following package
  • Figure 2: A code example for running Matchmaker in live streaming mode.
  • Figure 3: Two examples of error calculation using the mapping function. (a) shows a one-to-many alignment at the evaluation point, while (b) illustrates a skipped alignment.
  • Figure 4: Defined delay types of the system. Only system delay is considered in the experiment.
  • Figure 5: A scatter plot of mean absolute error (MAE) against Henle's difficulty level for the (n)ASAP and Batik datasets. The MAE results are from OLTWArzt.
  • ...and 2 more figures