SocialPulse: An Open-Source Subreddit Sensemaking Toolkit

Stephanie Birkelbach, Maria Teleki, Peter Carragher, Xiangjue Dong, Nehul Bhatnagar, James Caverlee

TL;DR

The paper addresses the need for open, adaptable tools to analyze large-scale online discourse across participation levels. It presents SocialPulse, an open-source pipeline that unifies data ingestion, bot filtration, BERTopic-based topic modeling, VADER sentiment analysis, and cross-subreddit analytics within an interactive dashboard. A case study on r/conspiracy demonstrates diverse topics, temporal engagement patterns, and cross-community content diffusion with r/politics, illustrating the toolkit's capacity for multi-level sensemaking. The work emphasizes transparency and reproducibility, and outlines future enhancements such as cross-platform support and LLM-assisted summaries to further aid exploratory social science research.

Abstract

Understanding how online communities discuss and make sense of complex social issues is a central challenge in social media research, yet existing tools for large-scale discourse analysis are often closed-source, difficult to adapt, or limited to single analytical views. We present SocialPulse, an open-source subreddit sensemaking toolkit that unifies multiple complementary analyses -- topic modeling, sentiment analysis, user activity characterization, and bot detection -- within a single interactive system. SocialPulse enables users to fluidly move between aggregate trends and fine-grained content, compare highly active and long-tail contributors, and examine temporal shifts in discourse across subreddits. The demo showcases end-to-end exploratory workflows that allow researchers and practitioners to rapidly surface themes, participation patterns, and emerging dynamics in large Reddit datasets. By offering an extensible and openly available platform, SocialPulse provides a practical and reusable foundation for transparent, reproducible sensemaking of online community discourse.
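
As a rough illustration of the comparison between highly active and long-tail contributors mentioned above, the sketch below drops bot-like accounts and then ranks the remaining authors by post count. The record fields, the `is_bot_like` name heuristic, and the `top_share` cutoff are all assumptions made for illustration; the text here does not specify how SocialPulse implements bot filtration or defines activity tiers.

```python
from collections import Counter

# Hypothetical post records; the field names are assumptions,
# not SocialPulse's actual schema.
POSTS = [
    {"author": "alice", "text": "first thread"},
    {"author": "alice", "text": "follow-up"},
    {"author": "alice", "text": "another reply"},
    {"author": "bob", "text": "one-off comment"},
    {"author": "carol", "text": "single question"},
    {"author": "AutoModerator", "text": "rule reminder"},
    {"author": "AutoModerator", "text": "rule reminder"},
    {"author": "remindme_bot", "text": "reminder set"},
]


def is_bot_like(author: str) -> bool:
    """Illustrative heuristic only: flag AutoModerator and names ending in 'bot'."""
    name = author.lower()
    return name == "automoderator" or name.endswith("bot")


def split_contributors(posts, top_share=0.1):
    """Return (highly_active, long_tail) author lists ranked by post count.

    top_share is the fraction of non-bot authors treated as highly active.
    It is an assumed cutoff; at least one author always lands in that group.
    """
    counts = Counter(
        p["author"] for p in posts if not is_bot_like(p["author"])
    )
    ranked = [author for author, _ in counts.most_common()]
    if not ranked:
        return [], []
    cutoff = max(1, int(len(ranked) * top_share))
    return ranked[:cutoff], ranked[cutoff:]


if __name__ == "__main__":
    active, tail = split_contributors(POSTS)
    print("highly active:", active)
    print("long tail:", tail)
```

A share-based cutoff is only one possible choice; an absolute post-count threshold or a percentile of activity would fit the same structure.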

Paper Structure

This paper contains 13 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: The SocialPulse pipeline supports rapid exploratory data analysis and sensemaking of Reddit communities; implementation details are provided in Table \ref{tab:links}.
  • Figure 2: The SocialPulse analytics interface supports rapid, multi-level sensemaking of Reddit discourse. The interface enables (a, top left) interactive topic exploration, (b, top right) temporal and sentiment analysis, (c, bottom left) topic-specific analysis, and (d, bottom right) cross-subreddit comparison, illustrated here for r/conspiracy and r/politics.