Tuning Into Bias: A Computational Study of Gender Bias in Song Lyrics
Danqing Chen, Adithi Satish, Rasul Khanbayov, Carolin M. Schuster, Georg Groh
TL;DR
This paper quantifies gender bias in English song lyrics by combining topic modeling with bias measurement. It uses BERTopic to cluster 537,553 lyrics into topics, then applies SC-WEAT to Word2Vec embeddings trained separately on each genre and on the most prominent topics to quantify gender associations within those themes. The key contributions are per-topic and per-genre bias analyses, which reveal a shift from romantic to sexualized content over the decades and systematic biases in which Intelligence and Strength words lean male while Appearance and Weakness words lean female. These findings highlight how thematic content and genre context shape gender stereotypes in lyrics, offering a computational lens for Digital Humanities and sociolinguistic interpretation, with potential implications for media studies and cultural analysis.
Abstract
Text mining methods are increasingly applied in the Humanities and Computational Social Sciences, as well as in a broader range of disciplines. This paper presents an analysis of gender bias in English song lyrics using topic modeling and bias measurement techniques. Leveraging BERTopic, we cluster a dataset of 537,553 English songs into distinct topics and analyze their temporal evolution. Our results reveal a significant thematic shift in song lyrics over time, from romantic themes toward a heightened focus on the sexualization of women. Additionally, we observe a substantial prevalence of profanity and misogynistic content across various topics, with a particularly high concentration in the largest thematic cluster. To further analyze gender bias across topics and genres quantitatively, we employ the Single Category Word Embedding Association Test (SC-WEAT) to calculate bias scores for word embeddings trained on the most prominent topics as well as on individual genres. The results indicate a consistent male bias in words associated with intelligence and strength, while appearance and weakness words show a female bias. Further analysis highlights variations in these biases across topics, illustrating the interplay between thematic content and gender stereotypes in song lyrics.
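To make the SC-WEAT bias score concrete, the following is a minimal sketch of the commonly used effect-size formulation: the difference between a target word's mean cosine similarity to male attribute words and to female attribute words, divided by the standard deviation of all those similarities. Several details are assumptions for illustration and are not taken from this paper: the function names, the toy 2-dimensional vectors (real inputs would be Word2Vec embeddings), the sign convention (positive means male-leaning), and the use of the sample standard deviation.

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sc_weat_effect_size(target, male_attrs, female_attrs):
    """SC-WEAT-style effect size for a single target word vector.

    Assumed convention: positive values indicate a closer association
    with the male attribute set, negative values with the female set.
    """
    male_sims = [cosine(target, a) for a in male_attrs]
    female_sims = [cosine(target, b) for b in female_attrs]
    # Assumption: sample standard deviation over all similarities.
    pooled_sd = statistics.stdev(male_sims + female_sims)
    return (statistics.mean(male_sims) - statistics.mean(female_sims)) / pooled_sd


# Hypothetical toy vectors standing in for trained word embeddings.
male_attrs = [[1.0, 0.1], [0.9, 0.2]]
female_attrs = [[0.1, 1.0], [0.2, 0.9]]
strength_word = [0.8, 0.3]     # constructed to lie near the male attributes
appearance_word = [0.3, 0.8]   # constructed to lie near the female attributes

male_leaning = sc_weat_effect_size(strength_word, male_attrs, female_attrs)
female_leaning = sc_weat_effect_size(appearance_word, male_attrs, female_attrs)
```

In this toy setup `male_leaning` is positive and `female_leaning` is negative. In a per-genre or per-topic analysis, the same function would be applied to embeddings trained on each subset of the corpus, so that the scores for one word can be compared across subsets.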
