Vocal Melody Construction for Persian Lyrics Using LSTM Recurrent Neural Networks

Farshad Jafari, Farzad Didehvar, Amin Gheibi

TL;DR

A seq2seq neural network was trained on parallel syllable and note sequences from Persian songs to suggest a pleasant melody for a new sequence of syllables, under the assumption that the lyric syllables and the melody of a song are phonologically correlated.

Abstract

The present paper investigates automatic melody construction for Persian lyrics given as input. It was assumed that there is a phonological correlation between the lyric syllables and the melody in a song. To investigate this assumption, a seq2seq neural network was developed and trained on parallel syllable and note sequences in Persian songs to suggest a pleasant melody for a new sequence of syllables. Because no digital dataset of Persian music was available, more than 100 pieces of Persian music were collected and converted from print to digital format. To evaluate the trained model, 14 new lyrics were given to it as input, and the suggested melodies were performed and recorded by music experts. The evaluation used an audio questionnaire answered by more than 170 people. On the pleasantness of the melody, the system outputs scored an average of 3.005 out of 5, while the human-made melodies for the same lyrics scored an average of 4.078.
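The idea of training on "parallel syllable and note sequences" can be illustrated with a minimal sketch of how such pairs might be turned into integer-encoded source and target sequences for a seq2seq model. Everything specific here is an assumption made for illustration and is not taken from the paper: the romanized syllables, the note token format (pitch plus a duration letter, such as `A4_q`), the special tokens, and the helper names `build_vocab` and `encode_parallel`.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: each musical sentence is a list of (syllable, note) pairs.
# Syllables and note tokens are illustrative only, not from the paper's dataset.
Pair = Tuple[str, str]

PAD, START, END = "<pad>", "<s>", "</s>"


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Map each distinct token to an integer id, reserving ids 0-2 for special tokens."""
    vocab = {PAD: 0, START: 1, END: 2}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode_parallel(
    lines: List[List[Pair]],
    syl_vocab: Dict[str, int],
    note_vocab: Dict[str, int],
) -> List[Tuple[List[int], List[int]]]:
    """Turn aligned (syllable, note) pairs into (source ids, target ids) examples.

    The target is wrapped in start/end markers, as is common for decoder inputs.
    """
    examples = []
    for line in lines:
        src = [syl_vocab[s] for s, _ in line]
        tgt = [note_vocab[START]] + [note_vocab[n] for _, n in line] + [note_vocab[END]]
        examples.append((src, tgt))
    return examples


lines = [
    [("del", "A4_q"), ("bar", "G4_e"), ("am", "F4_h")],
    [("bar", "G4_e"), ("del", "E4_q")],
]
syl_vocab = build_vocab([s for line in lines for s, _ in line])
note_vocab = build_vocab([n for line in lines for _, n in line])
examples = encode_parallel(lines, syl_vocab, note_vocab)

for src, tgt in examples:
    print(src, "->", tgt)
```

In a setup like this, the encoder would read the syllable ids and the decoder would be trained to emit the note ids one step at a time. The actual tokenization and alignment used by the authors may differ.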

Paper Structure

This paper contains 23 sections, 7 figures, and 4 tables.

Figures (7)

  • Figure 1: Architecture of seq2seq model
  • Figure 2: A note sheet example
  • Figure 3: An example of segmenting the musical sentences by silences in melody
  • Figure 4: The average pleasantness score, on a scale from one to five, given by the audience to the generated melody (G) and the human-made melody (O)
  • Figure 5: The average score of the pleasantness of each piece
  • ...and 2 more figures