Detection of Human and Machine-Authored Fake News in Urdu

Muhammad Zain Ali, Yuxia Wang, Bernhard Pfahringer, Tony Smith

TL;DR

This work extends the fake news detection schema to cover machine-generated news, with a focus on the Urdu language, and proposes a hierarchical detection strategy to improve accuracy and robustness.

Abstract

The rise of social media has amplified the spread of fake news, a problem now further complicated by large language models (LLMs) such as ChatGPT, which make it easy to generate highly convincing, error-free misinformation and increasingly difficult for the public to discern truth from falsehood. Traditional fake news detection methods that rely on linguistic cues also become less effective. Moreover, current detectors primarily focus on binary classification and English texts, often overlooking the distinction between machine-generated true and fake news as well as detection in low-resource languages. To this end, we update the detection schema to include machine-generated news, with a focus on the Urdu language. We further propose a hierarchical detection strategy to improve accuracy and robustness. Experiments show its effectiveness across four datasets in various settings.

Paper Structure

This paper contains 34 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Machine-generated news collection process
  • Figure 2: Proposed Hierarchical Fake News Detection Architecture
  • Figure 3: Cross-domain evaluation results in terms of Accuracy
  • Figure 4: Confusion matrices for testing on the long datasets using a model trained on Dataset 1. Left: test split of Dataset 3 (Long). Right: test split of Dataset 4 (Long)
  • Figure 5: Confusion matrices for testing on Dataset 1 using models trained on the long datasets. Left: train split of Dataset 3 (Long). Right: train split of Dataset 4 (Long)