Rage Music Classification and Analysis using K-Nearest Neighbour, Random Forest, Support Vector Machine, Convolutional Neural Networks, and Gradient Boosting
Akul Kumar
TL;DR
This work tackles the challenge of classifying rage music, a debated subgenre with subtle boundary cases. It benchmarks five classifiers (Random Forest, SVM, KNN, CNN, and Gradient Boosting) on an audio feature set extracted with librosa from 1,236 tracks. Key contributions are the identification of the top predictive features (among them song length, harmonic and percussive ratios, chroma mean, and MFCC3) and the finding that KNN is the strongest performer, with non-linear methods offering benefits. The study also examines a tempo threshold around 151 BPM and the central role of onset density, and highlights calibration and manifold learning as opportunities to improve deployment in music information retrieval.
Abstract
We classify rage music, a subgenre of rap well known for disagreement over whether a particular song belongs to the genre, using an extensive audio feature set and five algorithms: Random Forest, Support Vector Machine, K-Nearest Neighbour, Gradient Boosting, and Convolutional Neural Networks. We compare these classification methods as applied to machine-learning audio analysis and identify the best-performing models. We then analyze which audio features are most characteristic of rage music and most effective in categorizing it, and identify the broader sonic variations and trends that set it apart.
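
To make the classification setup concrete, the following is a minimal sketch of K-Nearest Neighbour classification on standardized audio features, written in plain Python. It is not the study's pipeline: the three features (tempo, onset density, duration), the synthetic values, the choice of k = 3, and the helper names `standardize`, `knn_predict`, and `leave_one_out_accuracy` are all assumptions made for illustration. Standardization is included because KNN is distance-based, so features measured in different units (seconds versus beats per minute) would otherwise contribute unequally.

```python
import math
import random
from collections import Counter
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each feature column so no single unit dominates the distance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each track in turn and classify it from the remaining ones."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


# Synthetic, purely illustrative data (NOT from the study):
# columns are tempo in BPM, onsets per second, duration in seconds.
rng = random.Random(0)
X_raw, y = [], []
for _ in range(30):
    X_raw.append([rng.gauss(160, 5), rng.gauss(6.0, 0.5), rng.gauss(150, 20)])
    y.append("rage")
    X_raw.append([rng.gauss(120, 5), rng.gauss(3.0, 0.5), rng.gauss(210, 20)])
    y.append("other")

X = standardize(X_raw)
accuracy = leave_one_out_accuracy(X, y, k=3)
print(f"leave-one-out accuracy on synthetic data: {accuracy:.2f}")
```

For brevity, the sketch computes the scaling statistics on the full dataset before holding tracks out. A stricter evaluation would fit the scaling on the training portion only, so that the held-out track does not influence it.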
