A Semi-supervised Fake News Detection using Sentiment Encoding and LSTM with Self-Attention

Pouya Shaeri, Ali Katanforoush

TL;DR

This work tackles fake news detection under label-scarce conditions with a semi-supervised framework that encodes sentiment from pretrained sentiment-analysis models and classifies with an LSTM with self-attention. It uses pseudo-labeling with confidence thresholds and a fold-based training scheme to avoid data leakage, feeding sentiment-encoded features into an embedding → LSTM with self-attention → dense architecture. Evaluated on the FakeNewsNet dataset, the approach achieves higher precision, recall, and F1 than strong baselines and prior semi-supervised methods, indicating robustness to limited labeled data. The findings suggest that transferring sentiment knowledge from pretrained models, combined with self-attentive sequence modeling, can improve fake news detection in social media contexts.
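The pseudo-labeling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D nearest-centroid classifier, the function names (`fit_centroids`, `predict_with_confidence`, `self_train`), the toy feature values, and the 0.8 threshold are all assumptions chosen for illustration. The sketch shows one plausible reading of fold-based self-training, where each unlabeled fold is labeled by a model fitted only on data labeled so far, and only confident predictions are kept.

```python
from statistics import mean

def fit_centroids(samples):
    """Fit a toy 1-D nearest-centroid classifier from (feature, label) pairs."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict_with_confidence(centroids, x):
    """Return (label, confidence) for a two-class problem.

    Confidence lies in [0.5, 1] and grows as x moves closer to one
    centroid relative to the other (an illustrative choice).
    """
    (l0, c0), (l1, c1) = sorted(centroids.items())
    d0, d1 = abs(x - c0), abs(x - c1)
    if d0 + d1 == 0:
        return l0, 0.5
    p0 = d1 / (d0 + d1)
    return (l0, p0) if p0 >= 0.5 else (l1, 1 - p0)

def self_train(labeled, unlabeled_folds, threshold=0.8):
    """Pseudo-label one fold at a time, keeping only confident predictions."""
    train = list(labeled)
    skipped = []
    for fold in unlabeled_folds:
        # Refit on everything labeled so far; the current fold is never
        # used to fit the model that labels it.
        centroids = fit_centroids(train)
        accepted = []
        for x in fold:
            label, conf = predict_with_confidence(centroids, x)
            if conf >= threshold:
                accepted.append((x, label))
            else:
                skipped.append(x)
        train.extend(accepted)
    return train, skipped

# Hypothetical toy data: low scores are "real", high scores are "fake".
labeled = [(0.0, "real"), (0.2, "real"), (1.0, "fake"), (0.8, "fake")]
folds = [[0.05, 0.5], [0.95, 0.45]]
train, skipped = self_train(labeled, folds, threshold=0.8)
```

In this toy run, the two samples near the class centroids are accepted as pseudo-labels, while the two ambiguous samples near the midpoint fall below the threshold and are left unlabeled.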

Abstract

Micro-blogs and online social networks are now the main media for receiving and sharing news. As a side effect, however, these networks can disseminate fake news that harms individuals and society. Several methods have been developed to detect fake news, but most require large sets of manually labeled data to attain application-level accuracy. Owing to strict privacy policies, the required data are often inaccessible or limited to specific topics. On the other hand, the diverse and abundant unlabeled data on social media suggest that, with only a small amount of labeled data, fake news detection could be tackled via semi-supervised learning. Here, we propose a semi-supervised self-learning method in which sentiment analysis is obtained from state-of-the-art pretrained models. Our model is trained in a semi-supervised fashion and incorporates LSTM with self-attention layers. We benchmark our model on a dataset of 20,000 news items along with their feedback; it shows better precision, recall, and F1 measures than competitive fake news detection methods.
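To make the role of attention over LSTM outputs concrete, here is a minimal sketch of attention pooling over a sequence of hidden states. It is a simplified single-query variant written in plain Python, not necessarily the formulation used in the paper; the names `softmax` and `attention_pool`, the `query` vector (standing in for a learned parameter), and the toy hidden states are assumptions for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Pool a sequence of hidden states into one context vector.

    Each time step t gets weight softmax(h_t . query); the context is
    the weighted sum of the hidden states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context

# Hypothetical hidden states for a 3-token sequence (2-D for readability).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # stands in for a learned attention vector
weights, context = attention_pool(hidden, query)
```

The third time step has the largest dot product with the query, so it receives the largest weight. In a full model, the resulting context vector would be passed on to the dense classification layers.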

Paper Structure

This paper contains 12 sections, 3 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Folding the Dataset
  • Figure 2: Fold 2 pseudo-labeled
  • Figure 3: All folds pseudo-labeled
  • Figure 4: Proposed Deep Neural Network Architecture
  • Figure 5: LSTM with Self-Attention
  • ...and 1 more figure