MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling
Drew Edwards, Xavier Riley, Pedro Sarmento, Simon Dixon
TL;DR
The paper addresses the conversion of symbolic guitar scores into guitar tablature, which requires assigning each note to a string and fret, a task complicated by the fact that most pitches can be played at several positions. It introduces a Transformer-based masked language modeling approach (a BART-style encoder–decoder) with a Structured MidiTok tokenization to predict per-note string assignments. The model is trained in two phases: pre-training on the large-scale DadaGP tablature dataset, then fine-tuning on professionally transcribed performances. Quantitative results show high next-note accuracy and strong autoregressive agreement, and in a user study with 15 guitarists the system's tablatures were often preferred over those of commercial tools, suggesting improved practical playability. The work advances automatic tablature inference without audio or video cues and points to future enhancements in tunings, articulations, and more physics-informed post-processing.
Abstract
Guitar tablatures enrich the structure of traditional music notation by assigning each note to a string and fret of a guitar in a particular tuning, indicating precisely where to play the note on the instrument. The problem of generating tablature from a symbolic music representation involves inferring this string and fret assignment per note across an entire composition or performance. On the guitar, multiple string-fret assignments are possible for most pitches, which leads to a combinatorial space too large for exhaustive search. Most modern methods use constraint-based dynamic programming to minimize some cost function (e.g., hand position movement). In this work, we introduce a novel deep learning solution to symbolic guitar tablature estimation. We train an encoder-decoder Transformer model in a masked language modeling paradigm to assign notes to strings. The model is first pre-trained on DadaGP, a dataset of over 25K tablatures, and then fine-tuned on a curated set of professionally transcribed guitar performances. Given the subjective nature of assessing tablature quality, we conduct a user study among guitarists, in which we ask participants to rate the playability of multiple versions of tablature for the same four-bar excerpt. The results indicate our system significantly outperforms competing algorithms.
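To make the ambiguity concrete, the following Python sketch enumerates the string-fret positions for a MIDI pitch and counts the assignments available for a short monophonic line. It also includes a toy dynamic program of the cost-minimizing kind the abstract mentions as the prevailing baseline. It is an illustration only, not the paper's model or any specific published algorithm: standard tuning, the 24-fret limit, the function names, and the cost (absolute fret distance between consecutive notes) are all assumptions made for the example.

```python
from math import prod

# Standard tuning, low E to high E, as MIDI note numbers (E2 A2 D3 G3 B3 E4).
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]
NUM_FRETS = 24  # assumption: a 24-fret instrument


def candidate_positions(pitch, tuning=STANDARD_TUNING, num_frets=NUM_FRETS):
    """Return every (string, fret) pair that sounds the given MIDI pitch.

    Strings are numbered 6 (lowest) down to 1 (highest), as in tablature.
    """
    positions = []
    for index, open_pitch in enumerate(tuning):
        fret = pitch - open_pitch
        if 0 <= fret <= num_frets:
            positions.append((len(tuning) - index, fret))
    return positions


def search_space_size(pitches):
    """Number of distinct string-fret assignments for a monophonic line."""
    return prod(len(candidate_positions(p)) for p in pitches)


def min_movement_tab(pitches):
    """Toy dynamic program: choose positions minimizing total fret movement.

    Hypothetical cost: |fret difference| between consecutive notes. Real
    systems use richer costs and constraints; this only shows the idea.
    Assumes every pitch is playable (has at least one candidate position).
    """
    # best maps each candidate position of the current note to the
    # (cost, path) of the cheapest assignment ending at that position.
    best = {pos: (0, [pos]) for pos in candidate_positions(pitches[0])}
    for pitch in pitches[1:]:
        step = {}
        for pos in candidate_positions(pitch):
            cost, path = min(
                (c + abs(pos[1] - prev[1]), p) for prev, (c, p) in best.items()
            )
            step[pos] = (cost, path + [pos])
        best = step
    return min(best.values())[1]


if __name__ == "__main__":
    print(candidate_positions(64))         # six positions for E4
    print(search_space_size([64, 65, 67, 69]))
    print(min_movement_tab([64, 65, 67]))
```

Under these assumptions, E4 (MIDI 64) can be played on all six strings, so even a four-note phrase around that pitch admits hundreds of assignments, and the count grows multiplicatively with phrase length. The dynamic program avoids enumerating them by keeping only the cheapest path into each candidate position, but its output is only as good as the hand-written cost function.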
