FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection

Ziyi Zhou, Xiaoming Zhang, Litian Zhang, Jiacheng Liu, Senzhang Wang, Zheng Liu, Xi Zhang, Chaozhuo Li, Philip S. Yu

TL;DR

This work introduces FineFake, a novel multi-domain knowledge-enhanced benchmark with fine-grained annotations that encompasses 16,909 data samples spanning six semantic topics and eight platforms, and proposes a knowledge-enhanced domain adaptation network.

Abstract

Existing benchmarks for fake news detection have significantly advanced models that assess the authenticity of news content. However, these benchmarks typically focus on news about a single semantic topic or from a single platform, and so fail to capture the diversity of multi-domain news in real scenarios. To understand fake news across domains, external knowledge and fine-grained annotations are indispensable: they provide precise evidence and uncover the diverse underlying fabrication strategies, yet existing benchmarks ignore both. To address this gap, we introduce a novel multi-domain knowledge-enhanced benchmark with fine-grained annotations, named FineFake. FineFake encompasses 16,909 data samples spanning six semantic topics and eight platforms. Each news item is enriched with multi-modal content, potential social context, semi-manually verified common knowledge, and fine-grained annotations that go beyond conventional binary labels. Furthermore, we formulate three challenging tasks based on FineFake and propose a knowledge-enhanced domain adaptation network. Extensive experiments are conducted on FineFake under various scenarios, providing accurate and reliable benchmarks for future endeavors. The entire FineFake project is publicly accessible as an open-source repository at https://github.com/Accuser907/FineFake.

Paper Structure (24 sections, 8 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: The proposed FineFake: a multi-domain dataset that encompasses instances from diverse platforms and topics. Each sample is associated with a corresponding image, accurate knowledge, and a fine-grained label.
  • Figure 2: The construction process of FineFake. Snopes is used as the starting point for data collection, and the external links within the claim explanations are leveraged as sources for multi-platform data collection. Platforms are categorized into official news platforms and social media platforms; from the latter, FineFake also collects potential social network information. Finally, each piece of news undergoes semi-manual knowledge annotation and fine-grained annotation to ensure label accuracy.
  • Figure 3: Basic information and statistical analysis of FineFake.
  • Figure 4: Overview of the KEAN model.
  • Figure 5: Hyper-parameter sensitivity analysis of $\alpha$ and $\beta$.
  • ...and 1 more figure