Understanding the Algorithm Behind Audio Key Detection
Henrique Perez G. Silva
TL;DR
This work tackles automatic musical key detection from audio using a profile-matching approach. It represents the tonal content of a recording as a mean chroma vector $\bar{C}_{norm}$, derived from STFT-based chromagrams, and compares it to 24 normalized key templates $P_{j,norm}$. The key is chosen by maximizing the cosine similarity $\mathrm{corr}_j = \bar{C}_{norm} \cdot P_{j,norm}$ across the 24 keys. The approach supports practical use in music information retrieval, organization, recommendations, and harmonic-mixing tools for DJs.
Abstract
The determination of musical key is a fundamental aspect of music theory and perception, providing a harmonic context for melodies and chord progressions. Automating this process, known as automatic key detection, is a significant task in the field of Music Information Retrieval (MIR). This article outlines an algorithmic methodology for estimating the musical key of an audio recording by analyzing its tonal content through digital signal processing techniques and comparison with theoretical key profiles.
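As a rough illustration of the profile-matching rule summarized above, the following Python sketch scores a mean chroma vector against 24 unit-length key templates and returns the best match. Several things here are assumptions rather than details taken from this article: the Krumhansl-Kessler profile values stand in for the theoretical key profiles, the chromagram is taken as an already-computed list of 12-bin frames (the STFT stage is omitted), and all function names are hypothetical.

```python
import math

# Assumption: Krumhansl-Kessler key profiles, index 0 = tonic.
# The article may use different theoretical profiles.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def rotate(profile, tonic):
    """Shift a profile so its tonic weight lands on pitch class `tonic`."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def build_templates():
    """Return the 24 normalized key templates (12 major, 12 minor)."""
    templates = {}
    for tonic, name in enumerate(PITCH_CLASSES):
        templates[f"{name} major"] = l2_normalize(rotate(MAJOR_PROFILE, tonic))
        templates[f"{name} minor"] = l2_normalize(rotate(MINOR_PROFILE, tonic))
    return templates


def mean_chroma(chromagram):
    """Average a chromagram (list of 12-bin frames) over time."""
    n_frames = len(chromagram)
    return [sum(frame[i] for frame in chromagram) / n_frames for i in range(12)]


def detect_key(chromagram):
    """Return (key_name, score) for the template with the highest dot product."""
    chroma = l2_normalize(mean_chroma(chromagram))
    scores = {
        key: sum(c * p for c, p in zip(chroma, template))
        for key, template in build_templates().items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Because both vectors are scaled to unit length, the dot product equals the cosine similarity, so a chromagram whose frames match a rotated profile exactly scores 1.0 for that key. For instance, `detect_key([rotate(MAJOR_PROFILE, 2)] * 3)` picks `"D major"`.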
