Analyzing Misinformation Claims During the 2022 Brazilian General Election on WhatsApp, Twitter, and Kwai

Scott A. Hale, Adriano Belisario, Ahmed Mostafa, Chico Camargo

TL;DR

This study presents a cross-platform analysis of misinformation during Brazil’s 2022 general election, combining data from WhatsApp tiplines and a TSE chatbot with data from Kwai and Twitter. Using MPNet embeddings for text, PDQ hashing for images, and TMK embeddings for videos, the authors quantify cross-platform overlap and reveal pronounced platform-specific content patterns shaped by each platform's affordances. They find low overlap between the WhatsApp tiplines and the TSE bot and limited cross-platform matches between WhatsApp and Twitter, while WhatsApp and Kwai content remains largely platform-specific, with many singleton clusters. The work highlights substantial methodological gaps in cross-platform claim matching and argues for platform-tailored monitoring approaches and data-sharing collaborations with fact-checkers to improve misinformation response in diverse digital ecosystems. The findings carry practical implications for election integrity, urging multi-platform strategies and advanced multimodal matching tools, particularly in contexts with encrypted messaging and emerging video platforms like Kwai.

Abstract

This study analyzes misinformation from WhatsApp, Twitter, and Kwai during the 2022 Brazilian general election. Given the democratic importance of accurate information during elections, multiple fact-checking organizations collaborated to identify and respond to misinformation via WhatsApp tiplines and power a fact-checking feature within a chatbot operated by Brazil's election authority, the TSE. WhatsApp is installed on over 99% of smartphones in Brazil, and the TSE chatbot was used by millions of citizens in the run-up to the elections. During the same period, we collected social media data from Twitter (now X) and Kwai (a popular video-sharing app similar to TikTok). Using the WhatsApp, Kwai, and Twitter data along with fact-checks from three Brazilian fact-checking organizations, we find unique claims on each platform. Even when the same claims are present on different platforms, they often differ in format, detail, length, or other characteristics. Our research highlights the limitations of current claim matching algorithms to match claims across platforms with such differences and identifies areas for further algorithmic development. Finally, we perform a descriptive analysis examining the formats (image, video, audio, text) and content themes of popular misinformation claims.
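
To make the idea of claim matching concrete, the following is a minimal sketch and not the authors' pipeline. It assumes each item has already been converted to an embedding vector (the study uses MPNet embeddings for text), links any pair of items whose cosine similarity meets a threshold, and groups linked items into clusters with union-find. The toy vectors, the threshold of 0.9, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster(vectors, threshold=0.9):
    """Group items whose pairwise similarity meets the (assumed) threshold.

    Items are linked transitively: if A matches B and B matches C,
    all three end up in the same cluster.
    """
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(vectors)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Hypothetical embeddings: items 0 and 1 are near-duplicates, 2 and 3 are unrelated.
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.98, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
clusters = cluster(toy_vectors)
singletons = [c for c in clusters if len(c) == 1]
print(clusters, len(singletons))
```

In this toy setting, two of the three clusters are singletons. A stricter threshold yields more singletons, which is one reason claims that differ in format, detail, or length across platforms can fail to match.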

Paper Structure (27 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Volume of filtered, election-related tweets during the study period. Only original tweets in Portuguese are included (i.e., retweets are excluded).
  • Figure 2: Left: The cumulative distribution function (CDF) comparing how many users (y-axis) submitted how many messages (x-axis) to the WhatsApp tiplines. While 54% of users sent only one message, the most prolific user sent 299. Right: The CDF comparing the number of clusters (y-axis) to their sizes (x-axis) on the WhatsApp tiplines. While 78% of clusters have only one item, the largest cluster has 858 instances of people thanking fact-checkers and the second largest has 210 instances of a video. Note that x-axes for both plots use log-10 scales.
  • Figure 3: Left: Number of tipline submissions per day. Right: Number of new clusters appearing per day in the fact-checkers' misinformation tiplines.
  • Figure 4: The relationship between the number of users and the amount of novel content (number of clusters) is mostly linear in the empirical data (solid blue line) and in random re-orders (pink dashed lines). The relationship is similar when considering all content (left) or only content that leads to a published fact-check (right).
  • Figure 5: Overlap between content submitted to the tiplines and TSE is low, but there is a weak, positive correlation between the number of times a video, image, or text item is sent to the TSE bot and the misinformation tiplines.
  • ...and 2 more figures
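
The comparison described in the Figure 4 caption can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: it counts how many distinct clusters have been seen as users are added one at a time, then repeats the count for random re-orderings of the same users. The toy data, the function names, and the number of shuffles are hypothetical.

```python
import random


def novelty_curve(user_clusters):
    """Cumulative number of distinct clusters after each successive user."""
    seen = set()
    curve = []
    for clusters in user_clusters:
        seen.update(clusters)
        curve.append(len(seen))
    return curve


def shuffled_curves(user_clusters, n_shuffles=5, seed=0):
    """Novelty curves for random re-orderings of the same users."""
    rng = random.Random(seed)
    curves = []
    for _ in range(n_shuffles):
        order = list(user_clusters)
        rng.shuffle(order)
        curves.append(novelty_curve(order))
    return curves


# Hypothetical data: each set holds the cluster IDs one user submitted.
toy_users = [{"a"}, {"a", "b"}, {"c"}, {"b", "d"}]
empirical = novelty_curve(toy_users)
baselines = shuffled_curves(toy_users)
print(empirical)
```

Every curve ends at the same total number of clusters; only the path differs. An empirical curve that stays close to the shuffled ones suggests that the order in which users arrive matters little to how quickly novel content accumulates.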