Tracking the 2024 US Presidential Election Chatter on Tiktok: A Public Multimodal Dataset

Gabriela Pinto, Charles Bickham, Tanishq Salkar, Luca Luceri, Emilio Ferrara

TL;DR

This paper releases a large-scale collection of TikTok posts related to the upcoming 2024 U.S. Presidential Election. An exploratory analysis of the data identifies the most common keywords, hashtags, and bigrams in both Spanish and English posts, focusing on the election and the two main Presidential candidates.

Abstract

This paper documents our release of a large-scale data collection of TikTok posts related to the upcoming 2024 U.S. Presidential Election. Our current data comprises 1.8 million videos published between November 1, 2023, and May 26, 2024. Its exploratory analysis identifies the most common keywords, hashtags, and bigrams in both Spanish and English posts, focusing on the election and the two main Presidential candidates, President Joe Biden and Donald Trump. We utilized the TikTok Research API, incorporating various election-related keywords and hashtags, to capture the full scope of relevant content. To address the limitations of the TikTok Research API, we also employed third-party scrapers to expand our dataset. The dataset is publicly available at https://github.com/gabbypinto/US2024PresElectionTikToks
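
The bigram analysis mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `top_bigrams`, the tokenization rule, and the optional stopword filter are assumptions made for illustration, and the input is assumed to be a plain list of transcript strings.

```python
import re
from collections import Counter

# Matches runs of Unicode letters, so accented Spanish words stay intact.
WORD_RE = re.compile(r"[^\W\d_]+")

def top_bigrams(transcripts, n=200, stopwords=frozenset()):
    """Return the n most frequent adjacent word pairs across transcripts.

    Hypothetical helper: bigrams are counted within each transcript only,
    and stopwords are dropped before pairing, so the words on either side
    of a removed stopword become adjacent.
    """
    counts = Counter()
    for text in transcripts:
        tokens = [t for t in WORD_RE.findall(text.lower()) if t not in stopwords]
        counts.update(zip(tokens, tokens[1:]))
    return counts.most_common(n)
```

Calling `top_bigrams(english_transcripts, n=200)` on a list of English transcript strings would yield pairs comparable in form to those plotted in the bigram figures; the language split itself would need a separate language-identification step that is not shown here.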

Paper Structure

This paper contains 1 section, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Timeline of events and volume of TikTok posts.
  • Figure 2: Top keywords from our query that appear in the 'video_description' attribute.
  • Figure 3: Classification of languages used within the transcripts (log scale).
  • Figure 4: 200 of the most frequent bigrams in the English transcripts.
  • Figure 5: 200 of the most frequent bigrams in the Spanish transcripts.
  • ...and 1 more figure