Enriching GNNs with Text Contextual Representations for Detecting Disinformation Campaigns on Social Media

Bruno Croso Cunha da Silva, Thomas Palmeira Ferraz, Roseli De Deus Lopes

TL;DR

Contextual text representations enhance GNN performance, achieving a 33.8% relative improvement in Macro F1 over models without textual features and a 9.3% relative improvement over static text representations.

Abstract

Disinformation on social media poses both societal and technical challenges, requiring robust detection systems. While previous studies have integrated textual information into propagation networks, they have yet to fully leverage the advancements in Transformer-based language models for high-quality contextual text representations. This work addresses this gap by incorporating Transformer-based textual features into Graph Neural Networks (GNNs) for fake news detection. We demonstrate that contextual text representations enhance GNN performance, achieving a 33.8% relative improvement in Macro F1 over models without textual features and a 9.3% relative improvement over static text representations. We further investigate the impact of different feature sources and the effects of noisy data augmentation. We expect our methodology to open avenues for further research, and we have made our code publicly available.
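The headline numbers are relative improvements in Macro F1, so it may help to see how both quantities are computed. The sketch below is a minimal, stdlib-only illustration: Macro F1 is the unweighted mean of the per-class F1 scores, and a relative improvement is the gain divided by the baseline score. The function names and the toy labels are assumptions made for illustration; they are not taken from the paper or its code.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score for one class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so a minority class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def relative_improvement(new_score, baseline_score):
    """Gain expressed as a fraction of the baseline score."""
    return (new_score - baseline_score) / baseline_score


# Toy labels (hypothetical, not data from the paper): 1 = fake, 0 = real.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
score = macro_f1(y_true, y_pred)

# Hypothetical scores chosen only to show the arithmetic.
gain = relative_improvement(0.60, 0.50)
print(f"macro F1 = {score:.3f}, relative improvement = {gain:.1%}")
```

A relative improvement is not the same as a difference in percentage points: in the hypothetical example, moving from 0.50 to 0.60 is a gain of 10 points but a 20% relative improvement.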

Paper Structure

This paper contains 18 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: The pipeline of our Text-Enriched GNNs starts with propagation graphs where the initial node represents a news article, and subsequent nodes form merged diffusion trees of root tweets, retweets, and replies. The dataset is oversampled to address class imbalance. Node features are enriched with textual embeddings from user profiles and retweets using BERTweet, with optional noise augmentation via NEFTune. Message-passing layers with pooling aggregate the nodes into a graph-level representation used to classify the news article.
  • Figure 2: F1 Macro, ROC AUC and AUC PR as functions of Retweets, Profiles, Embedder, and NEFTune Alpha.
  • Figure 3: F1 Macro, ROC AUC and AUC PR standard deviations as functions of Retweets, Profiles, Embedder, and NEFTune Alpha.
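
The pipeline described in the Figure 1 caption can be sketched in a few lines. The example below is a minimal, stdlib-only illustration under stated assumptions, not the authors' implementation: the node vectors stand in for BERTweet embeddings, the message-passing layer is a plain mean over each node and its neighbours, and the noise follows the NEFTune recipe of uniform noise in [-1, 1] scaled by alpha / sqrt(L * d). Using the number of nodes in place of the sequence length L is an assumption made here; the paper may scale the noise differently. All function names are hypothetical.

```python
import math
import random


def neftune_noise(features, alpha, rng):
    """Add uniform noise in [-1, 1], scaled by alpha / sqrt(N * d).

    Assumption: N (number of nodes) plays the role of NEFTune's sequence length.
    """
    n, d = len(features), len(features[0])
    scale = alpha / math.sqrt(n * d)
    return [[x + rng.uniform(-1.0, 1.0) * scale for x in row] for row in features]


def mean_message_passing(features, edges):
    """One round: each node averages its own vector with its neighbours' vectors."""
    n, d = len(features), len(features[0])
    neighbours = {i: [i] for i in range(n)}  # include a self-loop
    for u, v in edges:  # treat edges as undirected
        neighbours[u].append(v)
        neighbours[v].append(u)
    return [
        [sum(features[j][k] for j in neighbours[i]) / len(neighbours[i]) for k in range(d)]
        for i in range(n)
    ]


def mean_pool(features):
    """Collapse all node vectors into a single graph-level vector."""
    n, d = len(features), len(features[0])
    return [sum(row[k] for row in features) / n for k in range(d)]


# Toy propagation graph: node 0 is the news article, nodes 1 and 2 are tweets.
# The 2-dimensional vectors are placeholders for real text embeddings.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (0, 2)]

rng = random.Random(0)
noisy = neftune_noise(features, alpha=5.0, rng=rng)  # training-time augmentation
hidden = mean_message_passing(noisy, edges)
graph_vector = mean_pool(hidden)  # would feed a classifier head
print(graph_vector)
```

In NEFTune the noise is applied during training only, so at evaluation time the sketch would skip `neftune_noise` and pass the clean features directly to `mean_message_passing`.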