Exploring and Applying Audio-Based Sentiment Analysis in Music

Etash Jhanji

TL;DR

This study trains models for two tasks: (1) predicting the emotion of a musical clip over time and (2) determining the next emotion value in the time series after the music, to ensure seamless transitions.

Abstract

Sentiment analysis is a continuously explored area of text processing that deals with the computational analysis of opinions, sentiments, and subjectivity of text. However, this idea is not limited to text and speech; it can be applied to other modalities. In reality, humans do not express themselves in text as deeply as they do in music. The ability of a computational model to interpret musical emotions is largely unexplored and could have implications and uses in therapy and musical queuing. In this paper, two individual tasks are addressed. This study seeks to (1) predict the emotion of a musical clip over time and (2) determine the next emotion value after the music in a time series to ensure seamless transitions. Models are trained for both tasks using data from the Emotions in Music Database, which contains clips of songs selected from the Free Music Archive, annotated by multiple volunteers with levels of valence and arousal as defined in Russell's circumplex model of affect. Overall, the models performed both tasks effectively and accurately.
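To make the second task concrete, the following is a minimal sketch of how a valence/arousal time series could be framed as a next-value prediction problem. It is an illustration under stated assumptions, not the paper's implementation: the function name `make_windows`, the window length, and the synthetic annotation values are all hypothetical.

```python
import numpy as np

def make_windows(series, window):
    """Split a (T, 2) valence/arousal series into inputs and next-step targets.

    Hypothetical helper for illustration; not taken from the paper.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2 or series.shape[0] <= window:
        raise ValueError("need a 2-D series longer than the window")
    # Each input is `window` consecutive (valence, arousal) pairs.
    inputs = np.stack(
        [series[i:i + window] for i in range(len(series) - window)]
    )
    # Each target is the pair that immediately follows its input window.
    targets = series[window:]
    return inputs, targets

# Synthetic annotations (assumed values, not real data):
# column 0 is valence, column 1 is arousal, one row per time step.
t = np.arange(10)
series = np.column_stack([0.1 * t, -0.05 * t])

X, y = make_windows(series, window=4)
print(X.shape, y.shape)
```

Each row of `X` could then be fed to a sequence model or a regression model, with the matching row of `y` as the value to predict.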

Paper Structure (18 sections, 7 figures, 2 tables)

Introduction
Background
Representing Emotion
Audio Processing
Long Short-Term Memory Models
Dataset openSMILE, 1000SongforEmotioninMusic
Task 1: Predicting Emotion
Task 2: Intelligent Queuing
Linear Regression Approach
Results and Conclusions
Task 1
Task 2
Discussion
Future Direction
Error Analysis
...and 3 more sections

Figures (7)

Figure 1: Russell's Circumplex Model of Affect
Figure 2: The full pipeline of audio processing including clipping, mel spectrogram, and storage format.
Figure 3: Training and validation loss (MSE) versus number of epochs for task 1, using the best hyperparameters found. Training loss converges, with possible slight overfitting.
Figure 4: Training and validation loss (MSE) versus number of epochs for task 2, using the best hyperparameters found. The loss converges very quickly after the first two epochs.
Figure 5: Arousal and valence for one song plotted against time on the z-axis. Dotted lines represent the centerlines and edges of the circumplex model.
...and 2 more figures
