A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis
Stephen Ni-Hahn, Weihan Xu, Jerry Yin, Rico Zhu, Simon Mak, Yue Jiang, Cynthia Rudin
TL;DR
This work addresses the shortage of large, high-quality, machine-readable Schenkerian Analysis (SchA) data by introducing a growing dataset of analyses, a data-collection and notation tool, and a heterogeneous graph representation for SchA. The dataset contains over 140 excerpts (145 analyses across multiple analysts), spans diverse composers, and is designed to grow over time. The notation tool provides an accessible JSON-based encoding with cross-notation interoperability, while the graph formulation enables flexible modeling of multivoice SchA and clustering-based analysis. Together, these contributions enable data-driven exploration of SchA for music information retrieval and generation tasks, and they lay groundwork for learning complex hierarchical musical structure.
Abstract
Schenkerian Analysis (SchA) is a uniquely expressive method of music analysis, combining elements of melody, harmony, counterpoint, and form to describe the hierarchical structure supporting a work of music. However, despite its analytical power and its potential to improve music understanding and generation, SchA has rarely been used by the computer music community. This is in large part due to the scarcity of high-quality data in a computer-readable format. With a larger corpus of Schenkerian data, it may be possible to infuse machine learning models with a deeper understanding of musical structure, thus leading to more "human" results. To encourage further research in Schenkerian analysis and its potential benefits for music informatics and generation, this paper presents three main contributions: 1) a new and growing dataset of SchAs, the largest in human- and computer-readable formats to date (>140 excerpts), 2) novel software for visualizing and collecting SchA data, and 3) a novel, flexible representation of SchA as a heterogeneous-edge graph data structure.
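To make the idea of a heterogeneous-edge graph concrete, the following is a minimal sketch and not the paper's actual schema. It assumes that nodes are notes and that each edge carries a type label; the names `Note`, `HeteroGraph`, `"surface"`, and `"deeper_level"` are hypothetical and chosen only for illustration. The example encodes a three-note descent (E, D, C) in which the middle note is treated as a passing tone, so a deeper-level edge connects E directly to C.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Note:
    """A note node (hypothetical fields, not the paper's encoding)."""
    pitch: int    # MIDI pitch number
    onset: float  # onset time in beats


@dataclass
class HeteroGraph:
    """Graph whose edges are grouped by a type label."""
    # edges[edge_type] is a set of (source, target) note pairs
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_edge(self, edge_type, src, dst):
        self.edges[edge_type].add((src, dst))

    def neighbors(self, node, edge_type):
        # Targets reachable from `node` along edges of one type, by onset
        return sorted(
            (dst for src, dst in self.edges[edge_type] if src == node),
            key=lambda n: n.onset,
        )


# E-D-C descent: D is a passing tone between E and C
e, d, c = Note(64, 0.0), Note(62, 1.0), Note(60, 2.0)

g = HeteroGraph()
g.add_edge("surface", e, d)       # note-to-note adjacency
g.add_edge("surface", d, c)
g.add_edge("deeper_level", e, c)  # assumed label for a reduction-level link
```

Keeping edge types separate lets a model or query treat surface adjacency and deeper structural links differently while sharing one set of note nodes.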
