DeepSRGM -- Sequence Classification and Ranking in Indian Classical Music with Deep Learning

Sathwik Tejaswi Madhusudhan, Girish Chowdhary

TL;DR

This work proposes a deep learning based approach to Raga recognition that employs efficient preprocessing and learns temporal sequences in music data using Long Short Term Memory based Recurrent Neural Networks (LSTM-RNN).

Abstract

A vital aspect of Indian Classical Music (ICM) is Raga, which serves as a melodic framework for compositions and improvisations alike. Raga recognition is an important music information retrieval task in ICM, as it can aid numerous downstream applications ranging from music recommendation to organizing huge music collections. In this work, we propose a deep learning based approach to Raga recognition. Our approach employs efficient preprocessing and learns temporal sequences in music data using Long Short Term Memory based Recurrent Neural Networks (LSTM-RNN). We train and test the network on smaller sequences sampled from the original audio, while the final inference is performed on the audio as a whole. Our method achieves an accuracy of 88.1% and 97% during inference on the CompMusic Carnatic dataset and its 10 Raga subset respectively, making it the state-of-the-art for the Raga recognition task. Our approach also enables sequence ranking, which aids in retrieving melodic patterns from a given music database that are closely related to the presented query sequence.
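The split between training on short sampled sequences and inferring on the whole audio can be illustrated with a minimal sketch. The abstract does not say how per-subsequence predictions are combined, so the majority vote used here is an assumption, as are the function names, the window length, and the number of samples; `classify` is a hypothetical stand-in for the trained LSTM-RNN.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def sample_subsequences(
    seq: Sequence[float], length: int, count: int, rng: random.Random
) -> List[List[float]]:
    """Draw `count` contiguous windows of `length` items at random offsets."""
    if len(seq) < length:
        raise ValueError("sequence is shorter than the requested window length")
    max_start = len(seq) - length
    starts = [rng.randint(0, max_start) for _ in range(count)]
    return [list(seq[s:s + length]) for s in starts]


def predict_whole_audio(
    seq: Sequence[float],
    classify: Callable[[List[float]], str],
    length: int,
    count: int,
    rng: random.Random,
) -> str:
    """Label the full sequence by majority vote over sampled windows.

    Assumption: the voting rule is illustrative only; the source text does
    not specify the aggregation step.
    """
    windows = sample_subsequences(seq, length, count, rng)
    votes = Counter(classify(w) for w in windows)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical classifier: a trained sequence model would go here.
    def toy_classifier(window: List[float]) -> str:
        return "raga_a" if sum(window) / len(window) < 0.5 else "raga_b"

    pitch_track = [0.1] * 400 + [0.9] * 100  # made-up pitch-like sequence
    print(predict_whole_audio(pitch_track, toy_classifier, 50, 21, random.Random(0)))
```

Sampling with an odd `count` avoids ties in a two-class vote; with more classes, `Counter.most_common` breaks ties by first occurrence.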

Paper Structure

This paper contains 29 sections, 7 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Preprocessing steps and model architecture for SRGM1 (see Section 3).
  • Figure 2: Schematic diagram of the sequence ranking algorithm. P, Q and R are copies of the same model and hence share the same architecture.
  • Figure 3: Overview of the inference process for SRGM1, as described in Section 5.3.1.
  • Figure 4: Confusion matrix for the predictions obtained using SRGM1.
  • Figure 5: Variation of the training loss versus epochs with changing subsequence length.
  • ...and 1 more figure