HyperGraphDis: Leveraging Hypergraphs for Contextual and Social-Based Disinformation Detection

Nikos Salamanos, Pantelitsa Leonidou, Nikolaos Laoutaris, Michael Sirivianos, Maria Aspri, Marius Paraschiv

TL;DR

HyperGraphDis tackles Twitter disinformation by encoding both social structure and cascade content within a hypergraph, enabling node-level cascade classification with a HypergraphConv network. It introduces a three-phase pipeline: (1) METIS-based partitioning of the user graph to form hyperedges; (2) enriched cascade features via augmented subgraphs and DeepWalk embeddings; (3) cascade classification using hypergraph convolution and dense layers. Across four datasets, including MM-COVID and the health-related FakeHealth data, it achieves state-of-the-art or near-state-of-the-art F1 scores, while delivering substantial training and inference speedups compared to baselines such as Meta-graph, HGFND, and Cluster-GCN. The approach demonstrates strong scalability and robustness across political and health misinformation scenarios, with explicit attention to dataset-specific structural characteristics and ethical data handling.
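The pipeline in the TL;DR can be made concrete with a small NumPy sketch, offered as an illustration rather than the authors' implementation. It assumes the user partition has already been computed (for example by METIS), and it assumes a cascade joins a hyperedge when at least one of its users falls in that partition; the summary does not spell out this membership rule. The names `build_incidence`, `hypergraph_conv`, `user_partition`, and `cascade_users` are hypothetical. The propagation step follows the standard hypergraph convolution form D^-1 H W B^-1 H^T X Theta, without the nonlinearity or dense layers.

```python
import numpy as np

def build_incidence(cascade_users, user_partition, num_partitions):
    """Incidence matrix H with one row per cascade and one column per hyperedge.

    Assumption: cascade i belongs to hyperedge p when at least one of its
    users was assigned to partition p.
    """
    H = np.zeros((len(cascade_users), num_partitions))
    for i, users in enumerate(cascade_users):
        for u in users:
            H[i, user_partition[u]] = 1.0
    return H

def hypergraph_conv(X, H, Theta, edge_weight=None):
    """One propagation step: D^-1 H W B^-1 H^T X Theta."""
    w = np.ones(H.shape[1]) if edge_weight is None else edge_weight
    node_deg = H @ w            # D: weighted count of hyperedges per node
    edge_deg = H.sum(axis=0)    # B: count of nodes per hyperedge
    d_inv = np.divide(1.0, node_deg, out=np.zeros_like(node_deg), where=node_deg > 0)
    b_inv = np.divide(1.0, edge_deg, out=np.zeros_like(edge_deg), where=edge_deg > 0)
    # Average node features into each hyperedge, then average back to nodes.
    edge_msg = b_inv[:, None] * (H.T @ (X @ Theta))
    return d_inv[:, None] * (H @ (w[:, None] * edge_msg))

# Toy data: four users in two partitions, three cascades (hypothetical values).
user_partition = {"a": 0, "b": 0, "c": 1, "d": 1}
cascade_users = [["a", "b"], ["b", "c"], ["d"]]

H = build_incidence(cascade_users, user_partition, num_partitions=2)
X = np.eye(3)        # one-hot cascade features, for readability only
Theta = np.eye(3)    # identity weights, so the output shows pure propagation
out = hypergraph_conv(X, H, Theta)
print(out)
```

With identity features and weights, each output row shows how much of every cascade's signal reaches a given cascade through shared hyperedges; the middle cascade spans both partitions, so it mixes information from all three.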

Abstract

In light of the growing impact of disinformation on social, economic, and political landscapes, accurate and efficient identification methods are increasingly critical. This paper introduces HyperGraphDis, a novel approach for detecting disinformation on Twitter that employs a hypergraph-based representation to capture (i) the intricate social structures arising from retweet cascades, (ii) relational features among users, and (iii) semantic and topical nuances. Evaluated on four Twitter datasets -- focusing on the 2016 U.S. Presidential election and the COVID-19 pandemic -- HyperGraphDis outperforms existing methods in both accuracy and computational efficiency, underscoring its effectiveness and scalability for tackling the challenges posed by disinformation dissemination. HyperGraphDis displays exceptional performance on a COVID-19-related dataset, achieving an impressive F1 score (weighted) of approximately 89.5%. This result represents a notable improvement of around 4% compared to the other state-of-the-art methods. Additionally, significant enhancements in computation time are observed for both model training and inference. In terms of model training, completion times are accelerated by a factor ranging from 2.3 to 7.6 compared to the second-best method across the four datasets. Similarly, during inference, computation times are 1.3 to 6.8 times faster than the state-of-the-art.
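Since the headline result is a weighted F1 score, a brief sketch of that metric may help. It assumes the usual definition (per-class F1 averaged with weights proportional to each class's number of true instances); the abstract does not describe the evaluation code. The function name `weighted_f1` and the toy labels are hypothetical.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)

# Toy labels (hypothetical): three fake cascades, one real cascade.
y_true = ["fake", "fake", "fake", "real"]
y_pred = ["fake", "fake", "real", "real"]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Weighting by support means the majority class dominates the score, which matters when the fake and real classes are imbalanced.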

Paper Structure (20 sections, 4 figures, 4 tables)

Figures (4)

  • Figure 1: A toy example of the proposed hypergraph construction pipeline.
  • Figure 2: A toy example of a hypergraph node. Left: the raw retweet data provided by the Twitter API always correspond to a star-like graph with limited structural information. Right: we enhance the raw data by adding the past interactions between the retweeters (i.e., the users) to the star-like graph, then compute the node embeddings in this enhanced subgraph using the DeepWalk algorithm. Bottom: the final feature vector consists of the DeepWalk embeddings together with the users' features and tweet-text features (topics, sentiments, etc.).
  • Figure 3: Ablation analysis for HyperGraphDis and Meta-graph on the MM-COVID and Health Release datasets. (a) & (b): The GNN layers refer to the HypergraphConv layers in HyperGraphDis and to the GCNConv layers in Meta-graph. (c) & (d): Training time for the models evaluated in (a) and (b). Five trials across all experiments.
  • Figure 4: Further evaluation of HyperGraphDis versus Meta-graph using the optimal number of GNN layers based on the results in Figure 3(a) & (b). Namely, for HyperGraphDis we use one HypergraphConv layer in MM-COVID and two layers in Health Release. For Meta-graph, we use three GCNConv layers in MM-COVID and one in Health Release.
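
The enhancement described in the Figure 2 caption (a star-like retweet graph extended with past interactions among the retweeters, followed by DeepWalk) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: edges are treated as undirected, only interactions between members of the cascade are kept, and only the random-walk stage of DeepWalk is shown (the skip-gram training that turns walks into embeddings is omitted). The names `augment_star`, `random_walks`, and the toy users are hypothetical.

```python
import random

def augment_star(source, retweeters, past_interactions):
    """Star graph (source linked to each retweeter) plus past retweeter links."""
    members = set(retweeters) | {source}
    adj = {u: set() for u in members}
    for r in retweeters:
        adj[source].add(r)
        adj[r].add(source)
    # Assumption: keep a past interaction only if both users are in the cascade.
    for u, v in past_interactions:
        if u in members and v in members and u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Uniform truncated random walks, the sampling stage of DeepWalk."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_length and adj[walk[-1]]:
                walk.append(rng.choice(sorted(adj[walk[-1]])))
            walks.append(walk)
    return walks

# Toy cascade: user "s" posted, "a", "b", "c" retweeted; "x" is outside it.
adj = augment_star("s", ["a", "b", "c"], [("a", "b"), ("c", "x")])
walks = random_walks(adj, walk_length=5, walks_per_node=2)
print(adj)
print(walks[0])
```

In the raw star graph every walk must pass through the source at every other step; the added retweeter links give the walks, and therefore the embeddings, more structure to capture.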