Chord-conditioned Melody and Bass Generation
Alexandra C Salem, Mohammad Shokri, Johanna Devaney
TL;DR
The paper addresses chord-conditioned generation of polyphonic melody and bass in Classical-style music. It compares five Transformer-based generation strategies on the TAVERN corpus: No Chord, Chord Independent, Chord Bass-1st, Chord Mel-1st, and Chord Co-Gen. The approach uses REMI tokenization and Harte-style chord representations, and the variants are systematically compared against ground-truth distributions using music-theory metrics for pitch content, intervallic movement, and chord-tone usage. Bass-first chord-conditioned generation (Chord Bass-1st) most faithfully replicates ground-truth pitch content and chord-tone usage; the independent, melody-first, and co-generation approaches perform more weakly, and the no-chord baseline is the least aligned. The work advances pedagogy-oriented music generation with objective, theory-based evaluation and points to future directions, including refined conditioning schemes and expert qualitative assessments.
Abstract
We evaluate five Transformer-based strategies for chord-conditioned melody and bass generation using a set of music theory-motivated metrics capturing pitch content, pitch interval size, and chord tone usage. The evaluated models include (1) no chord conditioning, (2) independent line chord-conditioned generation, (3) bass-first chord-conditioned generation, (4) melody-first chord-conditioned generation, and (5) chord-conditioned co-generation. We show that chord-conditioning improves the replication of stylistic pitch content and chord tone usage characteristics, particularly for the bass-first model.
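To make the chord tone usage metric concrete, the following is a minimal sketch of one way such a measure could be computed: the fraction of notes whose pitch class belongs to the chord sounding at that moment. It is not the authors' implementation. The function names, the `(midi_pitch, chord_label)` input format, and the small table of chord qualities are assumptions made for illustration; chord labels follow a simplified Harte-style `root:quality` form.

```python
# Illustrative sketch only: names, input format, and the quality table
# are assumptions, not the paper's actual metric code.

# Semitone offsets from the root for a small subset of chord qualities.
QUALITY_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "dim": (0, 3, 6),
    "7": (0, 4, 7, 10),
}

NATURAL_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def root_pitch_class(root):
    """Convert a root such as 'C', 'F#', or 'Bb' to a pitch class (0-11)."""
    pitch_class = NATURAL_PITCH_CLASS[root[0]]
    for accidental in root[1:]:
        if accidental == "#":
            pitch_class += 1
        elif accidental == "b":
            pitch_class -= 1
        else:
            raise ValueError(f"Unknown accidental in root: {root!r}")
    return pitch_class % 12


def chord_pitch_classes(label):
    """Return the set of pitch classes for a label such as 'G:7'."""
    root, quality = label.split(":")
    root_pc = root_pitch_class(root)
    return {(root_pc + offset) % 12 for offset in QUALITY_INTERVALS[quality]}


def chord_tone_ratio(notes):
    """Fraction of notes that are chord tones.

    `notes` is a list of (midi_pitch, chord_label) pairs, where the label
    is the chord active when the note sounds.
    """
    if not notes:
        return 0.0
    hits = sum(
        1 for pitch, label in notes if pitch % 12 in chord_pitch_classes(label)
    )
    return hits / len(notes)


if __name__ == "__main__":
    line = [(60, "C:maj"), (62, "C:maj"), (67, "C:maj"), (65, "G:7")]
    print(chord_tone_ratio(line))
```

Computing this ratio separately for generated and ground-truth melody and bass lines would give one simple basis for comparing how closely a model matches the corpus in chord tone usage.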
