Raply: A profanity-mitigated rap generator
Omar Manil Bendali, Samir Ferroum, Ekaterina Kozachenko, Youssef Parviz, Hanna Shcharbakova, Anna Tokareva, Shemair Williams
TL;DR
Raply tackles profanity-mitigated rap lyric generation by fine-tuning GPT-2 on Mitislurs, a dedicated corpus created to reduce offensive language while preserving rhyming quality. The approach combines a profanity-aware data pipeline (The Obscenity List and severity-based slur scoring) with a phoneme-based rhyme assessment to optimize rhyme density. Empirical results show that training on Mitislurs improves rhyme density relative to a Slurs baseline and reduces profane content, although it does not eliminate it entirely, while DopeLearning remains a strong comparative baseline. The work also discusses ethical, environmental, and legal considerations relevant to data sourcing and model deployment in lyric generation.
Abstract
Writing rap is challenging: the lyrics must follow complex rhyming schemes yet remain meaningful. In this work, we propose Raply, a fine-tuned GPT-2 model capable of producing meaningful rhyming text in the style of rap. In addition to its rhyming capabilities, the model is able to generate less offensive content. This was achieved by fine-tuning the model on Mitislurs, a new profanity-mitigated corpus. We evaluate the output of the model on two criteria: 1) rhyming, based on the rhyme density metric; 2) profanity content, measured against a list of English profanities. To our knowledge, this is the first attempt at profanity mitigation for rap lyric generation.
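As a rough illustration of the second evaluation criterion, the sketch below computes the fraction of word tokens in a generated text that appear in a profanity list. It is a minimal sketch under stated assumptions, not the evaluation code used for Raply: the function name `profanity_rate`, the regex tokenization, and the mild placeholder word list are all hypothetical, and the actual list and scoring may differ.

```python
import re

# Hypothetical placeholder list (assumption): a real evaluation would load
# a full list of English profanities instead of these mild stand-ins.
PLACEHOLDER_PROFANITIES = frozenset({"darn", "heck"})


def profanity_rate(text, profanities=PLACEHOLDER_PROFANITIES):
    """Return the fraction of word tokens in `text` found in `profanities`."""
    # Lowercase and keep runs of letters and apostrophes as tokens
    # (a simplifying assumption about tokenization).
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in profanities)
    return hits / len(tokens)


if __name__ == "__main__":
    sample = "Heck, that beat is darn good"
    print(f"profanity rate: {profanity_rate(sample):.3f}")
```

A token-level rate like this only catches exact list matches; it would miss obfuscated spellings and multi-word expressions, which is one reason list-based measurement can understate remaining profanity.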
