On the Interplay between Musical Preferences and Personality through the Lens of Language
Eliran Shem-Tov, Ella Rabinovich
TL;DR
This study asks whether musical preferences leave traces in spontaneous language, viewed through the Big Five personality traits. It introduces GenBigFive, a large LLM-generated corpus of trait-specific text, and trains logistic-regression classifiers that predict the five personality dimensions from text embeddings. Applying these models to a Reddit-based dataset of nearly 5,000 users across five genres reveals significant, interpretable differences in personality profiles among genre fans, and a modest but above-chance ability to predict genre from personality alone. The work provides open resources and demonstrates a scalable approach to integrating language, music psychology, and personality analysis, with potential applications in personalization and sociolinguistics.
Abstract
Music serves as a powerful reflection of individual identity, often aligning with deeper psychological traits. Prior research has established correlations between musical preferences and personality, while separate studies have demonstrated that personality is detectable through linguistic analysis. Our study bridges these two research domains by investigating whether individuals' musical preferences leave traces in their spontaneous language through the lens of the Big Five personality traits (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism). Using a carefully curated dataset of over 500,000 text samples from nearly 5,000 authors with reliably identified musical preferences, we build models that assess authors' personality characteristics from their text. Our results reveal significant personality differences across fans of five musical genres. We release resources for future research at the intersection of computational linguistics, music psychology, and personality analysis.
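To make the modeling idea concrete, the following is a minimal sketch of one binary logistic-regression classifier per Big Five trait, each operating on fixed-length text embeddings. It is illustrative only and rests on several assumptions: the embeddings are synthetic random vectors rather than real text embeddings, the classifier is a small hand-written gradient-descent implementation standing in for a library one, and the helper names (`train_logreg`, `make_toy_data`, `author_profile`) and the step of averaging per-text probabilities into an author-level profile are hypothetical, not taken from the paper.

```python
import math
import random

TRAITS = ["Openness", "Conscientiousness", "Extroversion",
          "Agreeableness", "Neuroticism"]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    """Probability that the text expresses the 'high' pole of a trait."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def make_toy_data(rng, n=200, dim=5):
    # Assumption: synthetic stand-in for embeddings, where the label for
    # trait k depends only on the sign of coordinate k.
    X = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    Y = {t: [1 if x[k] > 0 else 0 for x in X] for k, t in enumerate(TRAITS)}
    return X, Y

rng = random.Random(0)
X, Y = make_toy_data(rng)

# One independent binary classifier per trait.
models = {t: train_logreg(X, Y[t]) for t in TRAITS}

def author_profile(embeddings):
    """Hypothetical aggregation: mean per-text probability for each trait."""
    return {
        t: sum(predict_proba(models[t], e) for e in embeddings) / len(embeddings)
        for t in TRAITS
    }

# Training accuracy per trait on the toy data.
train_acc = {
    t: sum((predict_proba(models[t], x) > 0.5) == bool(lbl)
           for x, lbl in zip(X, Y[t])) / len(X)
    for t in TRAITS
}
```

Calling `author_profile` on a list of embeddings for one author's texts returns a five-entry dictionary of scores between 0 and 1. Under the assumptions above, such profiles could then be compared across groups of authors, for example fans of different genres.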
