MurkySky: Analyzing News Reliability on Bluesky

Vikas Reddy, Giovanni Luca Ciampaglia

TL;DR

The paper asks how reliable news content is on Bluesky, a decentralized platform, and introduces MurkySky, a public tool that combines Bluesky's Firehose with NewsGuard source ratings. The tool visualizes news-link reliability in both absolute and relative terms. The accompanying analyses cover hashtag co-occurrence, audience segmentation via k-core decomposition of the like/repost network, and the political orientation of sources. In a June–August 2024 snapshot, content from unreliable sources is rare (~2%), reliable content is predominantly left-leaning, and unreliable content is associated with distinct topic clusters. The work offers a replicable framework for monitoring information quality on decentralized social media, with practical insights for researchers, platform operators, and policymakers concerned with civic discourse on Bluesky.

Abstract

Bluesky has recently emerged as a lively competitor to Twitter/X as a platform for public discourse and news sharing. Most research on Bluesky so far has focused on characterizing its adoption due to migration. There has been less interest in characterizing the properties of Bluesky as a platform for news sharing and discussion, and in particular the prevalence of unreliable information on it. To fill this gap, this research provides the first comprehensive analysis of news reliability on Bluesky. We introduce MurkySky, a public tool to track the prevalence of content from unreliable news sources on Bluesky. Using firehose data from the summer of 2024, we find that news content from reliable sources is prevalent on Bluesky and originates largely from left-leaning sources. Content from unreliable news sources, while accounting for a small fraction of all news-linking posts, tends to originate from more partisan sources, but largely reflects the left-leaning skew of the platform. Analysis of the language and hashtags used in news-linking posts shows that unreliable-source content concentrates on specific topics of discussion.

Paper Structure (18 sections, 2 equations, 4 figures, 1 table)

Figures (4)

  • Figure 1: Left: Hourly total counts of reliable, unreliable, and total news links on Bluesky. Right: Proportion of unreliable news links relative to total news links on Bluesky.
  • Figure 2: Hashtag co-occurrence for different $k$-core networks ($k=11, 16, 21, 26$). Node color indicates the reliability of news links shared with each hashtag (yellow = unreliable, purple = reliable). Node positions are computed with a force-directed layout.
  • Figure 3: Word clouds for each modularity class from the max-$k$-core of the likes and reposts network for posts with links to news sources. The size of each word is proportional to its log-odds ratio (see Eq. \ref{eq:logodds}).
  • Figure 4: Left: Content prevalence, by source orientation and reliability. Right: Rank-frequency plots of news source popularity.
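
The quantities plotted in Figure 1 (hourly counts of reliable and unreliable news links, and the unreliable share of all news links) can be illustrated with a minimal sketch. This is not the MurkySky implementation: the `RATINGS` table, the `.example` domains, and the function names `domain_of` and `hourly_unreliable_share` are hypothetical stand-ins, and the sketch assumes each post has already been reduced to a (timestamp, URL) pair and that links to unrated domains are not counted as news.

```python
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlparse

# Hypothetical reliability labels keyed by domain (True = reliable).
# In practice these would come from a source-rating provider.
RATINGS = {
    "reliable.example": True,
    "unreliable.example": False,
}


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def hourly_unreliable_share(posts, ratings):
    """Map each UTC hour to (reliable, unreliable, unreliable_share).

    `posts` is an iterable of (unix_timestamp, url) pairs. Links whose
    domain has no rating are skipped (assumption: they are not news).
    """
    reliable, unreliable = Counter(), Counter()
    for ts, url in posts:
        label = ratings.get(domain_of(url))
        if label is None:
            continue
        # Truncate the timestamp to the start of its UTC hour.
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).replace(
            minute=0, second=0, microsecond=0
        )
        (reliable if label else unreliable)[hour] += 1

    result = {}
    for hour in sorted(set(reliable) | set(unreliable)):
        r, u = reliable[hour], unreliable[hour]
        result[hour] = (r, u, u / (r + u))
    return result


# Toy input: four rated links in the 12:00 hour, one in the 13:00 hour,
# plus one link to an unrated domain that is ignored.
base = datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc).timestamp()
posts = [
    (base + 60, "https://www.reliable.example/a"),
    (base + 120, "https://reliable.example/b"),
    (base + 180, "https://reliable.example/c"),
    (base + 240, "https://unreliable.example/x"),
    (base + 3700, "https://reliable.example/d"),
    (base + 3800, "https://unrated.example/y"),
]
shares = hourly_unreliable_share(posts, RATINGS)
for hour, (r, u, share) in shares.items():
    print(hour.isoformat(), r, u, round(share, 3))
```

Keeping the absolute counts alongside the ratio mirrors the two panels of Figure 1: a high unreliable share in a given hour can come from only a handful of links, so the ratio is easier to interpret next to the raw volumes.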