Music Genre Classification: Training an AI model

Keoikantse Mogonediwa

TL;DR

The paper investigates music genre classification by comparing four algorithms—Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), K-Nearest Neighbours (KNN), and a Random Forest wide model—using features extracted from audio signals via Short-Time Fourier Transform and MFCCs on the GTZAN dataset. It finds that the Random Forest approach yields the highest accuracy (~84%), while CNN, MLP, and KNN show substantially lower performance under the reported configurations. The study discusses data quality issues, including a couple of corrupted jazz files and potential class imbalance, and emphasizes the role of STFT-based features in enabling effective classification. The results have practical implications for building robust, audio-based genre classifiers in streaming and music information retrieval systems.

Abstract

Music genre classification applies machine learning models and techniques to the processing of audio signals, with applications such as content and music recommendation systems. In this research I explore various machine learning algorithms for music genre classification, using features extracted from audio signals. The systems are a Multilayer Perceptron (built from scratch), a k-Nearest Neighbours classifier (also built from scratch), a Convolutional Neural Network, and a Random Forest wide model. To process the audio signals, I apply feature extraction methods such as the Short-Time Fourier Transform (STFT) and the extraction of Mel-Frequency Cepstral Coefficients (MFCCs). Through this research, I aim to assess the robustness of machine learning models for genre classification and to compare their results.
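
To make the STFT step mentioned above concrete, the following is a minimal sketch of a magnitude spectrogram computed with only the Python standard library. It is an illustration under stated assumptions, not the paper's implementation: the function names `hann` and `stft_magnitude`, the Hann window, and the `frame_len` and `hop` values are all hypothetical choices, and a naive DFT is used for readability where a real pipeline would use an FFT from an audio library. It does not compute MFCCs.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one list of frame_len // 2 + 1 magnitudes per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    # Number of frames needed to cover the signal, with at least one frame.
    n_frames = max(1, math.ceil((len(signal) - frame_len) / hop) + 1)
    # Zero-pad the tail so the last frame is full length.
    target_len = (n_frames - 1) * hop + frame_len
    padded = list(signal) + [0.0] * (target_len - len(signal))

    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    spectrogram = []
    for f in range(n_frames):
        start = f * hop
        frame = [padded[start + i] * window[i] for i in range(frame_len)]
        magnitudes = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i in range(frame_len)
            )
            magnitudes.append(abs(acc))
        spectrogram.append(magnitudes)
    return spectrogram


if __name__ == "__main__":
    # Synthetic tone with exactly 8 cycles per 64-sample frame.
    tone = [math.sin(2.0 * math.pi * 8 * i / 64) for i in range(256)]
    spec = stft_magnitude(tone)
    peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
    print(len(spec), "frames; peak bin in first frame:", peak_bin)
```

Each row of the result is one time frame and each column one frequency bin, which is the layout a spectrogram image or a downstream classifier would typically consume. The zero-padding of the final frame mirrors the idea of padding shorter clips so that every example yields the same number of frames.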

Paper Structure (15 sections, 10 figures)

Introduction
Is the problem solved?
A possible solution
Intended Experiment methods
Data Pre-Processing
Baseline Multilayer Perceptron
Model Specifications
Deep Neural Network: CNN
Model Specifications and results
K-Nearest Neighbours From Scratch
Model Specifications and results
Random Forest Wide model
Model Specifications and results
Comparisons
Conclusion

Figures (10)

Figure 1: Typical flowchart demonstrating the steps involved in music genre classification
Figure 2: Metadata information retrieved from the dataset.
Figure 3: STFT of a randomly selected Reggae audio file
Figure 4: STFT of a randomly selected Reggae audio file in which padding has been applied
Figure 5: Spectrogram of the selected reggae genre audio file
...and 5 more figures
