BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social

Ujun Jeong, Bohan Jiang, Zhen Tan, H. Russell Bernard, Huan Liu

TL;DR

This work collects existing Bluesky Feeds, along with the users who generated and liked them, and provides tools to gather users’ social interactions within a specified date range. Together, these capture past user behavior and support future data collection.

Abstract

Decentralized social media platforms like Bluesky Social (Bluesky) have made it possible to publicly disclose some user behaviors with millisecond-level precision. Embracing Bluesky's principles of open-source and open-data, we present the first collection of the temporal dynamics of user-driven social interactions. BlueTempNet integrates multiple types of networks into a single multi-network, including user-to-user interactions (following and blocking users) and user-to-community interactions (creating and joining communities). Communities are user-formed groups in custom Feeds, where users subscribe to posts aligned with their interests. Following Bluesky's public data policy, we collect existing Bluesky Feeds, including the users who liked and generated these Feeds, and provide tools to gather users' social interactions within a date range. This data-collection strategy captures past user behaviors and supports the future data collection of user behavior.
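To make the idea of gathering timestamped interactions within a date range concrete, here is a minimal sketch. It is not the authors' tooling: the `Edge` record, the `parse_utc` and `edges_in_range` helpers, the user and Feed identifiers, and the assumed timestamp layout (ISO 8601 with milliseconds and a trailing `Z`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Edge:
    """One timestamped interaction (hypothetical record layout)."""
    source: str          # the acting user
    target: str          # another user, or a Feed (community)
    kind: str            # "follow", "block", "create", or "join"
    timestamp: datetime  # timezone-aware, in UTC


def parse_utc(ts: str) -> datetime:
    """Parse an assumed 'YYYY-MM-DDTHH:MM:SS.mmmZ' string into a UTC datetime."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def edges_in_range(edges, start, end):
    """Return edges with start <= timestamp < end, ordered by time."""
    selected = (e for e in edges if start <= e.timestamp < end)
    return sorted(selected, key=lambda e: e.timestamp)


# Made-up example data covering the interaction types named in the abstract.
edges = [
    Edge("alice", "bob", "follow", parse_utc("2024-02-05T10:15:30.123Z")),
    Edge("alice", "feed:science", "join", parse_utc("2024-02-07T08:00:00.000Z")),
    Edge("bob", "carol", "block", parse_utc("2024-03-01T23:59:59.999Z")),
]

start = parse_utc("2024-02-01T00:00:00.000Z")
end = parse_utc("2024-03-01T00:00:00.000Z")
for edge in edges_in_range(edges, start, end):
    print(edge.timestamp.isoformat(), edge.source, edge.kind, edge.target)
```

The half-open interval (start inclusive, end exclusive) is a design choice of this sketch; it lets consecutive date ranges be collected without counting an interaction twice.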

Paper Structure

This paper contains 17 sections, 1 equation, 7 figures, and 9 tables.

Figures (7)

  • Figure 1: Overview of BlueTempNet, a multi-network integrating two categories of users' social interactions, user-to-user (following, blocking) and user-to-community (creating, joining), with each interaction timestamped at $t_i$.
  • Figure 2: An example of a user's interactions on Bluesky. The left panel shows user-to-community interactions: joining a community (purple dashed line) and creating a community (green dashed line). The right panel shows user-to-user interactions: the change in profiles after following (blue arrow) and blocking (red arrow) another user.
  • Figure 3: Our data collection pipeline for BlueTempNet, shown in three levels. Each stage of collection captures the timestamp of a user's interactions with millisecond-level precision in UTC format.
  • Figure 4: Scatter plot of the frequency of terms in Feed display names (y-axis) against their Pearson correlations with the number of likes received (x-axis). Points are color-coded by the direction of the correlation. The points with the top 10 highest and lowest Pearson correlations are labeled with the corresponding terms.
  • Figure 5: Trends in the increase of new edges across three networks: $G_\mathcal{C}$, $G_\mathcal{M}$, and $G_\mathcal{A}$. The lines represent the number of new edges added on each date, while the shaded area indicates the cumulative distribution function (CDF) for each interaction type. The red dotted line marks Bluesky's transition to an invitation-free platform.
  • ...and 2 more figures
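
The two quantities plotted in Figure 5 (new edges per date and their cumulative distribution) can be sketched as follows. This is an illustrative computation only, not the authors' code; the function name `daily_counts_and_cdf` and the sample dates are hypothetical, and the input is assumed to be one calendar date per edge.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def daily_counts_and_cdf(edge_dates):
    """Return (days, new_edges, cdf) for an iterable of per-edge dates.

    days      : sorted distinct dates on which at least one edge appears
    new_edges : number of edges added on each of those dates
    cdf       : cumulative fraction of all edges added up to each date
    """
    counts = Counter(edge_dates)
    days = sorted(counts)
    new_edges = [counts[d] for d in days]
    total = sum(new_edges)
    cdf = [running / total for running in accumulate(new_edges)]
    return days, new_edges, cdf


# Made-up dates for four edges of a single interaction type.
sample = [date(2024, 2, 5), date(2024, 2, 5), date(2024, 2, 6), date(2024, 2, 8)]
days, new_edges, cdf = daily_counts_and_cdf(sample)
for d, n, c in zip(days, new_edges, cdf):
    print(d.isoformat(), n, c)
```

Running the function once per network would give one curve per interaction type. Dates with no new edges are simply absent from the output here; a plotting routine may want to fill them with zeros.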