Fake News Detection on Social Media: A Data Mining Perspective

Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang, Huan Liu

TL;DR

This survey defines fake news narrowly as news articles that are intentionally and verifiably false, and analyzes the unique challenges of detecting such content on social media. It proposes a two-phase data-mining framework that combines news-content and social-context features with distinct models (knowledge-based and style-based for content; stance-based and propagation-based for social context) to improve detection. The paper reviews datasets and evaluation metrics, discusses related areas such as rumor classification and truth discovery, and outlines open issues and future directions, with emphasis on data quality, early detection, and interdisciplinary approaches. Overall, the work highlights how integrating psychological insights with data-driven social-context analysis can help mitigate the spread and impact of misinformation on online platforms.

Abstract

Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research area that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers into believing false information, which makes it difficult and nontrivial to detect based on news content alone; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself, as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including characterizations of fake news based on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics, and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.

Paper Structure

This paper contains 24 sections, 4 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Fake news on social media: from characterization to detection.
  • Figure 2: Future directions and open issues for fake news detection on social media.