IOHunter: Graph Foundation Model to Uncover Online Information Operations

Marco Minici, Luca Luceri, Francesco Fabbri, Emilio Ferrara

TL;DR

This work introduces a methodology designed to identify users orchestrating information operations, a.k.a. IO drivers, across various influence campaigns, and achieves state-of-the-art performance across multiple sets of IOs originating from six countries, significantly surpassing existing approaches.

Abstract

Social media platforms have become vital spaces for public discourse, serving as modern agoràs where a wide range of voices influence societal narratives. However, their open nature also makes them vulnerable to exploitation by malicious actors, including state-sponsored entities, who can conduct information operations (IOs) to manipulate public opinion. The spread of misinformation, false news, and misleading claims threatens democratic processes and societal cohesion, making it crucial to develop methods for the timely detection of inauthentic activity to protect the integrity of online discourse. In this work, we introduce a methodology designed to identify users orchestrating information operations, a.k.a. IO drivers, across various influence campaigns. Our framework, named IOHunter, leverages the combined strengths of Language Models and Graph Neural Networks to improve generalization in supervised, scarcely-supervised, and cross-IO contexts. Our approach achieves state-of-the-art performance across multiple sets of IOs originating from six countries, significantly surpassing existing approaches. This research marks a step toward developing Graph Foundation Models specifically tailored for the task of IO detection on social media platforms.

Paper Structure

This paper contains 14 sections, 4 equations, 2 figures, 4 tables, 1 algorithm.

Figures (2)

  • Figure 1: Illustration summarizing IOHunter. We feed the user posts to a multilingual SBert and then average all post embeddings to obtain a single textual representation per user. SBert is frozen and not optimized. We also extract a degree-based embedding from the fused similarity network. The two modality embeddings are blended through a cross-attention layer and, after concatenation, a couple of fully connected layers. The resulting multi-modal representation is then fed to a GNN model that determines whether the user is an IO driver or a legitimate user. Both the multi-modal embedding module and the GNN model are optimized during the training phase.
  • Figure 2: Performance of IOHunter and Node2Vec+RF under different levels of training data sparsity.
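
The data flow in the Figure 1 caption can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the toy vectors stand in for frozen SBert outputs, the adjacency matrix is a hypothetical fused similarity network, plain concatenation replaces the cross-attention and fully connected layers, and a single round of untrained mean aggregation stands in for the GNN. All function and variable names are made up for illustration.

```python
# Illustrative sketch only: toy vectors stand in for frozen SBert outputs,
# and the cross-attention, FC layers, and trained GNN are omitted.

def mean_pool(post_embeddings):
    """Average a user's post embeddings into one textual vector."""
    n = len(post_embeddings)
    dim = len(post_embeddings[0])
    return [sum(vec[i] for vec in post_embeddings) / n for i in range(dim)]


def degree_features(adjacency):
    """Return each node's degree, normalized by the maximum degree."""
    degrees = [sum(row) for row in adjacency]
    max_degree = max(degrees) or 1  # avoid division by zero on an empty graph
    return [d / max_degree for d in degrees]


def aggregate_neighbors(features, adjacency):
    """One round of mean aggregation over each node and its neighbors."""
    out = []
    for i, row in enumerate(adjacency):
        group = [i] + [j for j, edge in enumerate(row) if edge and j != i]
        dim = len(features[i])
        out.append(
            [sum(features[j][k] for j in group) / len(group) for k in range(dim)]
        )
    return out


# Hypothetical toy data: 3 users with 2-dimensional "post embeddings".
posts = {
    0: [[1.0, 0.0], [0.0, 1.0]],
    1: [[1.0, 1.0]],
    2: [[0.0, 0.0], [2.0, 2.0], [1.0, 1.0]],
}
# Hypothetical fused similarity network (symmetric, unweighted).
adjacency = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

text = [mean_pool(posts[u]) for u in range(3)]
degree = degree_features(adjacency)
# Assumption: plain concatenation instead of cross-attention + FC layers.
user_features = [text[u] + [degree[u]] for u in range(3)]
# Assumption: untrained mean aggregation instead of a learned GNN layer.
smoothed = aggregate_neighbors(user_features, adjacency)
```

In a trained model, the aggregated representation would pass through learned weights and a classification head that separates IO drivers from legitimate users. Here the output only shows how textual and structural signals mix across neighbors in the similarity network.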