Raply: A profanity-mitigated rap generator

Omar Manil Bendali, Samir Ferroum, Ekaterina Kozachenko, Youssef Parviz, Hanna Shcharbakova, Anna Tokareva, Shemair Williams

TL;DR

Raply addresses profanity-mitigated rap lyric generation by fine-tuning GPT-2 on Mitislurs, a dedicated corpus built to reduce offensive language while preserving rhyme quality. The approach combines a profanity-aware data pipeline (The Obscenity List and severity-based slur scoring) with a phoneme-based rhyme assessment used to optimize rhyme density. Empirical results show that training on Mitislurs improves rhyme density relative to a Slurs baseline and reduces profane content without eliminating it entirely, while DopeLearning remains a strong comparative baseline. The work also discusses ethical, environmental, and legal considerations relevant to data sourcing and model deployment in lyric generation.

Abstract

Writing rap is challenging: the lyrics must follow complex rhyme schemes yet remain meaningful. In this work, we propose Raply, a fine-tuned GPT-2 model capable of producing meaningful rhyming text in the style of rap. In addition to its rhyming capabilities, the model is able to generate less offensive content. This was achieved by fine-tuning the model on Mitislurs, a new profanity-mitigated corpus. We evaluate the output of the model on two criteria: 1) rhyming, based on the rhyme density metric; 2) profanity content, using a list of profanities for the English language. To our knowledge, this is the first attempt at profanity mitigation for rap lyrics generation.
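
The two evaluation criteria named in the abstract can be illustrated with a rough sketch. It is not the authors' implementation. The paper's rhyme density is phoneme-based, whereas this sketch uses written vowels as a crude stand-in, and the lexicon holds neutral placeholder tokens instead of the actual profanity list. All function names, the `window` parameter, and the lexicon contents are assumptions made for illustration.

```python
import re

# Hypothetical placeholder lexicon (assumption): the paper uses a list of
# English profanities; neutral stand-in tokens are used here instead.
PROFANITY_LEXICON = {"darn", "heck"}

VOWELS = "aeiou"


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def profanity_rate(text, lexicon=PROFANITY_LEXICON):
    """Fraction of tokens that appear in the profanity lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(tok in lexicon for tok in tokens) / len(tokens)


def common_suffix_len(a, b):
    """Length of the longest common suffix of two strings."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def rhyme_density(text, window=15):
    """Average, over words, of the longest vowel sequence ending at a word
    that also ends at one of the preceding `window` words.

    Simplification (assumption): written vowels approximate phonemes, and
    words without any vowel letter are ignored.
    """
    vowels = ["".join(ch for ch in tok if ch in VOWELS) for tok in tokenize(text)]
    vowels = [v for v in vowels if v]
    if not vowels:
        return 0.0
    total = 0
    for i in range(len(vowels)):
        ending_i = "".join(vowels[: i + 1])  # vowel stream up to word i
        best = 0
        for j in range(max(0, i - window), i):
            ending_j = "".join(vowels[: j + 1])  # vowel stream up to word j
            best = max(best, common_suffix_len(ending_i, ending_j))
        total += best
    return total / len(vowels)
```

In this sketch, a higher `rhyme_density` value means longer shared vowel sequences between nearby words, and a lower `profanity_rate` means fewer lexicon matches per token. A faithful reproduction would replace the written-vowel approximation with a phonemic transcription.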

Paper Structure (16 sections, 4 figures, 7 tables)

Figures (4)

  • Figure 1: The distribution of profanities by categories according to their severity ratings in The Obscenity List.
  • Figure 2: Distribution of profanities across matched categories in the raw dataset.
  • Figure 3: Distribution of slur scores in the raw dataset.
  • Figure 4: Distribution of profanities across matched categories in the filtered dataset.