A Machine Learning Approach for MIDI to Guitar Tablature Conversion

Maximos Kaliakatsos-Papakostas, Gregoris Bastas, Dimos Makris, Dorien Herremans, Vassilis Katsouros, Petros Maragos

TL;DR

The paper addresses the conversion of MIDI note sets into guitar tablature under playability constraints for a standard six-string guitar. It presents a two-stage method: a Probabilistic Fretboard Deep Neural Network outputs a $6\times25$ fretboard probability map from the current MIDI frame and prior frames, and a fingering selection step then maximizes $p^\top b$ subject to playability rules. Using the DadaGP GP-tablature dataset, it shows that data augmentation (injecting additional pitches to mimic non-guitar MIDI) generally improves performance, even in simple monophonic cases. The study also identifies two limitations, the greedy analytic step and the lack of future-context modeling, and proposes bidirectional architectures and octave-transposition handling as avenues for improvement.
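
The second stage can be illustrated with a small sketch. It reads $p$ as the $6\times25$ probability map and $b$ as a binary fretboard of the same shape, so that $p^\top b$ is the sum of the probabilities at the selected positions; this reading, the function name `best_fingering`, the string ordering, and the `max_span` stretch limit are all assumptions made for illustration. The search below is exhaustive rather than the greedy analytic step the paper describes, and its two playability rules (one note per string, bounded fret span) are stand-ins, not the paper's rule set.

```python
from itertools import product

# Standard tuning as MIDI note numbers, low E (index 0) to high E (index 5).
# The index order is an assumption; the paper's map may order strings differently.
OPEN_STRINGS = [40, 45, 50, 55, 59, 64]
NUM_FRETS = 25  # frets 0..24, matching a 6x25 fretboard map


def best_fingering(p, pitches, max_span=4):
    """Exhaustively pick one (string, fret) per pitch, maximizing the summed
    probability p[string][fret] under two illustrative playability rules.

    Returns (score, [(string, fret), ...]) or None if nothing is playable.
    """
    # All positions on the fretboard that sound each requested pitch.
    options = [
        [(s, pitch - open_pitch)
         for s, open_pitch in enumerate(OPEN_STRINGS)
         if 0 <= pitch - open_pitch < NUM_FRETS]
        for pitch in pitches
    ]
    best = None
    for combo in product(*options):
        strings = [s for s, _ in combo]
        if len(set(strings)) != len(strings):
            continue  # rule 1: a string can sound only one note at a time
        fretted = [f for _, f in combo if f > 0]  # open strings need no finger
        if fretted and max(fretted) - min(fretted) > max_span:
            continue  # rule 2: hypothetical limit on finger stretch
        score = sum(p[s][f] for s, f in combo)  # equals p^T b for this choice
        if best is None or score > best[0]:
            best = (score, list(combo))
    return best


# Toy probability map that favours an open-position C major triad.
p = [[0.0] * NUM_FRETS for _ in range(6)]
p[1][3], p[2][2], p[3][0] = 0.9, 0.8, 0.7
print(best_fingering(p, [48, 52, 55]))
```

With this toy map the selected fingering is C3 on the A string at fret 3, E3 on the D string at fret 2, and G3 on the open G string. Exhaustive search is feasible here because each pitch has at most six candidate positions.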

Abstract

Guitar tablature transcription consists of deducing the string and the fret number on which each note should be played to reproduce the actual musical part. This assignment should lead to playable string-fret combinations throughout the entire track and, in general, preserve parsimonious motion between successive combinations. Throughout the history of guitar playing, specific chord fingerings have been developed across different musical styles that facilitate common idiomatic voicing combinations and motion between them. This paper presents a method for assigning guitar tablature notation to a given MIDI-based musical part (possibly consisting of multiple polyphonic tracks); that is, no information about guitar-idiomatic expressive characteristics (e.g. bending) is involved. The current strategy is based on machine learning and requires a basic assumption about how far fingers can stretch on a fretboard; only standard 6-string guitar tuning is examined. The proposed method also examines the transcription of music pieces that were not meant to be played, or could not possibly be played, by a guitar (e.g. potentially a symphonic orchestra part), employing a rudimentary method for augmenting musical information and training/testing the system with artificial data. The results present interesting aspects of what the system can achieve when trained on the initial and augmented datasets, showing that training with augmented data improves performance even in simple, e.g. monophonic, cases. Results also indicate weaknesses and lead to useful conclusions about possible improvements.
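
To make the assignment problem concrete, the sketch below lists every string and fret on which a single MIDI pitch can be played in standard tuning. The open-string MIDI numbers are the standard ones (E2 = 40 up to E4 = 64); the 24-fret limit is an assumption taken from the $6\times25$ map mentioned in the TL;DR, and `candidate_positions` is a hypothetical helper name, not something from the paper.

```python
# Open-string pitches of a standard-tuned six-string guitar as MIDI numbers.
STANDARD_TUNING = {"E2": 40, "A2": 45, "D3": 50, "G3": 55, "B3": 59, "E4": 64}
MAX_FRET = 24  # assumed highest fret (25 positions per string, counting the open string)


def candidate_positions(midi_pitch):
    """Return every (string_name, fret) pair that sounds the given MIDI pitch."""
    return [
        (name, midi_pitch - open_pitch)
        for name, open_pitch in STANDARD_TUNING.items()
        if 0 <= midi_pitch - open_pitch <= MAX_FRET
    ]


# E4 (MIDI 64) can be played on all six strings; E2 (MIDI 40) only on the open low E.
print(candidate_positions(64))
print(candidate_positions(40))
```

A single pitch can therefore have up to six valid positions, which is why the choice has to be driven by context such as the other notes in the frame and the preceding fingerings.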

Paper Structure

This paper contains 7 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Examined architecture.
  • Figure 2: Examined architecture.
  • Figure 3: Data preparation overview.
  • Figure 4: Training losses for all training epochs in the guitar-only and augmented transcription problems (see the paper's epochs table for optimal conditions).
  • Figure 5: Example that indicates the possible usefulness of incorporating future information.
  • ...and 1 more figure