Link Me Baby One More Time: Social Music Discovery on Spotify

Shazia'Ayn Babul, Desislava Hristova, Antonio Lima, Renaud Lambiotte, Mariano Beguerisse-Díaz

TL;DR

This work analyzes how social factors influence short-term music discovery on Spotify by examining explicit link-share events within a large, multiplex social network. The authors combine user/artist embeddings, track-level context, and rich tie-strength signals to test hypotheses about taste similarity, tie strength, and social cohesion, and they train a Random Forest classifier to predict post-share engagement with an artist. Key contributions include quantifying the role of taste alignment, direct versus broadcast sharing, and peer influence in driving activation, and delivering a predictive model with ROC-AUC around 0.73 that highlights the most predictive feature groups. The findings offer practical insights into designing socially informed music recommendations and illuminate the mechanisms of social contagion and homophily in music discovery on streaming platforms.
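
To make the ROC-AUC figure above easier to interpret: ROC-AUC equals the probability that a randomly chosen positive example (here, a share that led to engagement) receives a higher model score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = receiver engaged with the shared artist, 0 = did not.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.5]
toy_auc = roc_auc(toy_labels, toy_scores)  # 7 of 9 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a score around 0.73 means that roughly 73% of such positive/negative pairs are ordered correctly. The quadratic pairwise loop is kept for clarity; a rank-based computation would be preferable on large samples.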

Abstract

We explore the social and contextual factors that influence the outcome of person-to-person music recommendations and discovery. Specifically, we use data from Spotify to investigate how a link sent from one user to another results in the receiver engaging with the music of the shared artist. We consider several factors that may influence this process, such as the strength of the sender-receiver relationship, the user's role in the Spotify social network, their music social cohesion, and how similar the new artist is to the receiver's taste. We find that the receiver of a link is more likely to engage with a new artist when (1) they have similar music taste to the sender and the shared track is a good fit for their taste, (2) they have a stronger and more intimate tie with the sender, and (3) the shared artist is popular amongst the receiver's connections. Finally, we use these findings to build a Random Forest classifier to predict whether a shared music track will result in the receiver's engagement with the shared artist. This model elucidates which types of social and contextual features are most predictive, although peak performance is achieved when a diverse set of features is included. These findings provide new insights into the multifaceted mechanisms underpinning the interplay between music discovery and social processes.

Paper Structure

This paper contains 18 sections, 1 equation, 10 figures, and 2 tables.

Figures (10)

  • Figure 1: A synthetic network representing the structure of Spotify social network. The Spotify social network is a multiplex network, each layer corresponding to a different social interaction. Social listening sessions and collaborative playlists represent undirected layers, while link sharing is a directed layer.
  • Figure 2: Distribution of the album release age at share-time for our link sharing events, separated by broadcast and direct application types.
  • Figure 3: User-artist engagement threshold choice. (a) User-artist engagement curves as a function of the number of tracks $n_{i,\alpha}(t)$ for different numbers of days listened $t_{\max}$ within a 7-day period. (b) Empirical cumulative distribution function (ECDF) of the link shares and the associated receiver-artist engagement 7 days post share, $E_{i,\alpha}(t_{0}, 7)$. The orange line indicates our chosen threshold for a successful share event.
  • Figure 4: Distributions of music taste similarity for pairs of users who shared music in our sample (share) and a random permutation of those pairs (random).
  • Figure 5: Analysis of music taste similarity. The blue line shows the probability of becoming engaged with an artist as a function of the feature (main y-axis), and the shaded area is the $95\%$ confidence interval. The histograms show the distribution of the feature in our sample (secondary y-axis, Frequency).
  • ...and 5 more figures
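
The comparison described in the Figure 4 caption, observed sender-receiver pairs against a random permutation of those pairs, can be sketched as follows. This is a minimal illustration under stated assumptions: the caption does not say which similarity measure is used, so cosine similarity over user embedding vectors is assumed here, and the function names and toy vectors are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; defined as 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def share_vs_random(
    pairs: Sequence[Tuple[Vector, Vector]], seed: int = 0
) -> Tuple[List[float], List[float]]:
    """Return similarities for observed pairs and for a shuffled baseline.

    The baseline keeps every sender and every receiver but reassigns
    receivers to senders at random, which breaks the observed pairing.
    """
    senders = [s for s, _ in pairs]
    receivers = [r for _, r in pairs]
    shuffled = list(receivers)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    observed = [cosine_similarity(s, r) for s, r in zip(senders, receivers)]
    baseline = [cosine_similarity(s, r) for s, r in zip(senders, shuffled)]
    return observed, baseline


# Hypothetical toy embeddings: each sender shares with a like-minded receiver.
toy_pairs = [
    ((1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 1.0, 0.0), (0.0, 1.0, 0.0)),
    ((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)),
]
observed_sims, baseline_sims = share_vs_random(toy_pairs)
```

If sharing pairs tend to have similar taste, the observed distribution should sit to the right of the shuffled one. A single shuffle is shown for brevity; repeating the permutation many times would give a more stable baseline.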