Audio Processing using Pattern Recognition for Music Genre Classification

Sivangi Chatterjee, Srishti Ganguly, Avik Bose, Hrithik Raj Prasad, Arijit Ghosal

TL;DR

This project applies machine learning to music genre classification on the GTZAN dataset, which contains 100 audio files per genre. It compares Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, and Artificial Neural Networks (ANN) implemented via Keras.

Abstract

This project explores the application of machine learning techniques to music genre classification using the GTZAN dataset, which contains 100 audio files per genre. Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres (Blues, Classical, Jazz, Hip Hop, and Country) using Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, and Artificial Neural Networks (ANN) implemented via Keras. The ANN model performed best, achieving a validation accuracy of 92.44%. We also analyzed key audio features such as spectral roll-off, spectral centroid, and MFCCs, which helped improve the model's accuracy. Future work will extend the model to all ten GTZAN genres, investigate more advanced methods such as Long Short-Term Memory (LSTM) networks and ensemble approaches, and develop a web application for real-time genre classification and playlist generation. This research aims to contribute to better music recommendation systems and content curation.
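As a rough illustration of the spectral features named in the abstract, the sketch below computes the zero-crossing rate, spectral centroid, and spectral roll-off of a single audio frame with NumPy. It is not the authors' code: the function name `frame_features`, the parameter `rolloff_pct`, and the 85% roll-off threshold are assumptions chosen for the example. MFCCs are left out because they additionally need a mel filterbank and a discrete cosine transform, which are usually taken from an audio library.

```python
import numpy as np


def frame_features(frame, sr, rolloff_pct=0.85):
    """Return (zcr, centroid_hz, rolloff_hz) for one mono audio frame.

    Hypothetical helper for illustration; rolloff_pct=0.85 is an assumed
    threshold, not a value taken from the paper.
    """
    frame = np.asarray(frame, dtype=float)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frame)
    zcr = float(np.mean(signs[1:] != signs[:-1]))

    # Magnitude spectrum of the frame and the centre frequency of each bin.
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    total = mag.sum()
    if total == 0:
        # Silent frame: the spectral features are undefined, so report zeros.
        return zcr, 0.0, 0.0

    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = float((freqs * mag).sum() / total)

    # Spectral roll-off: lowest frequency below which rolloff_pct of the
    # total spectral magnitude is contained.
    cumulative = np.cumsum(mag)
    idx = int(np.searchsorted(cumulative, rolloff_pct * total))
    rolloff = float(freqs[min(idx, len(freqs) - 1)])

    return zcr, centroid, rolloff


# Usage: a one-second 440 Hz sine tone sampled at 22050 Hz.
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
zcr, centroid, rolloff = frame_features(tone, sr)
```

For a pure tone, both the centroid and the roll-off sit at the tone's frequency. Recordings with more high-frequency energy push both values upward, which is why such features can help separate genres.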

Paper Structure

This paper contains 21 sections, 5 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Machine Learning Block Diagram
  • Figure 2: ZCR variation across five genres
  • Figure 3: Spectral Centroid variation across five genres
  • Figure 4: Spectral Roll-Off variation across five genres
  • Figure 5: MFCC variation across five genres
  • ...and 6 more figures