MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models

Lionel Z. Wang, Yiming Ma, Renfei Gao, Beichen Guo, Han Zhu, Wenqi Fan, Zexin Lu, Ka Chung Ng

TL;DR

This study analyzes the creation of fake news from a social psychology perspective and develops LLM-Fake Theory, a comprehensive LLM-based theoretical framework. It also introduces a novel pipeline that automates fake news generation with LLMs, eliminating the need for manual annotation.

Abstract

The advent of large language models (LLMs) has revolutionized online content creation, making it much easier to generate high-quality fake news. This misuse threatens the integrity of our digital environment and ethical standards. Therefore, understanding the motivations and mechanisms behind LLM-generated fake news is crucial. In this study, we analyze the creation of fake news from a social psychology perspective and develop a comprehensive LLM-based theoretical framework, LLM-Fake Theory. We introduce a novel pipeline that automates the generation of fake news using LLMs, thereby eliminating the need for manual annotation. Utilizing this pipeline, we create a theoretically informed Machine-generated Fake news dataset, MegaFake, derived from the GossipCop dataset. We conduct comprehensive analyses to evaluate our MegaFake dataset. We believe that our dataset and insights will provide valuable contributions to future research focused on the detection and governance of fake news in the era of LLMs.

Paper Structure

This paper contains 50 sections, 37 figures, and 8 tables.

Figures (37)

  • Figure 1: Results for Different Topic Numbers (Information Blending)
  • Figure 2: Document Matching Process (Information Blending)
  • Figure 3: Document Matching Process (News Summarization)
  • Figure 4: Results for Different Topic Numbers (News Summarization)
  • Figure 5: Results for Different Temperatures (Writing Enhancement)
  • ...and 32 more figures