A (More) Realistic Evaluation Setup for Generalisation of Community Models on Malicious Content Detection

Ivo Verhoeven; Pushkar Mishra; Rahel Beloch; Helen Yannakoudakis; Ekaterina Shutova

TL;DR

This paper targets the gap between static benchmark evaluations and the evolving nature of malicious content on social graphs, proposing a $k$-shot subgraph sampling setup to assess inductive generalisation to unseen graphs and domains. It combines gradient-based graph meta-learning (MAML/ProtoNet variants) with prototypical initialisation to enable rapid adaptation from local subgraphs, under realistic constraints such as limited labels and bounded social context. Empirical results show that standard, non-episodic community models struggle to generalise to new graphs like CoAID and TwitterHateSpeech, while meta-learned models with prototypical initialisation consistently improve transfer performance, especially at higher $k$-shot regimes. The work underscores the need for realistic benchmarks in malicious-content detection and provides open-source code to accelerate future research in graph-based generalisation.

Abstract

Community models for malicious content detection, which take into account the context from a social graph alongside the content itself, have shown remarkable performance on benchmark datasets. Yet, misinformation and hate speech continue to propagate on social media networks. This mismatch can be partially attributed to the limitations of current evaluation setups that neglect the rapid evolution of online content and the underlying social graph. In this paper, we propose a novel evaluation setup for model generalisation based on our few-shot subgraph sampling approach. This setup tests for generalisation through few labelled examples in local explorations of a larger graph, emulating more realistic application settings. We show this to be a challenging inductive setup, wherein strong performance on the training graph is not indicative of performance on unseen tasks, domains, or graph structures. Lastly, we show that graph meta-learners trained with our proposed few-shot subgraph sampling outperform standard community models in the inductive setup. We make our code publicly available.
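The few-shot subgraph sampling summarised above (and illustrated in the Figure 1 caption) can be sketched roughly as follows. This is a minimal illustration and not the authors' released code: the adjacency-dict graph representation, the function names, the `walk_length` parameter, the `1 / (1 + count)` unmasking weight, and the reading of $k$ as "up to $k$ labelled documents per class" are all assumptions made for the example.

```python
import random
from collections import deque


def r_radius_neighbourhood(adj, anchor, r):
    """Breadth-first search: every node within r hops of the anchor."""
    dist = {anchor: 0}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        if dist[node] == r:
            continue  # do not expand beyond the radius
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist)


def random_walk_subsample(adj, allowed, anchor, starts, max_nodes, walk_length, rng):
    """Grow a node set by random walks from `starts`, never leaving `allowed`."""
    kept = {anchor}
    for start in starts:
        if len(kept) >= max_nodes:
            break
        kept.add(start)
        current = start
        for _ in range(walk_length):
            if len(kept) >= max_nodes:
                break
            candidates = [n for n in adj[current] if n in allowed]
            if not candidates:
                break
            current = rng.choice(candidates)
            kept.add(current)
    return kept


def sample_support_subgraph(adj, doc_labels, anchor, r, k, max_nodes,
                            appearances, rng, walk_length=4):
    """Return (nodes, unmasked): a bounded subgraph and its k-shot labels.

    `appearances` counts how often each document was sampled before and is
    updated in place; rarely seen documents are more likely to be unmasked.
    """
    hood = r_radius_neighbourhood(adj, anchor, r)
    docs = sorted(n for n in hood if n in doc_labels)
    rng.shuffle(docs)
    nodes = random_walk_subsample(adj, hood, anchor, docs, max_nodes, walk_length, rng)

    unmasked = {}
    for cls in sorted(set(doc_labels.values())):
        pool = sorted(d for d in nodes if doc_labels.get(d) == cls)
        for _ in range(min(k, len(pool))):
            # Assumed weighting: inversely proportional to prior appearances.
            weights = [1.0 / (1 + appearances.get(d, 0)) for d in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            pool.remove(pick)
            unmasked[pick] = cls

    for d in nodes:
        if d in doc_labels:
            appearances[d] = appearances.get(d, 0) + 1
    return nodes, unmasked
```

Calling `sample_support_subgraph` repeatedly with a shared `appearances` dict spreads the labelled examples across documents, so that no single document dominates the support sets.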

Paper Structure (33 sections, 10 equations, 5 figures, 14 tables, 1 algorithm)

Introduction
Related Work
Community Models
Generalisable Content-only Models
Subgraph Sampling & Meta-learning
Datasets & Tasks
Methodology
Community Modelling
Few-shot Subgraph Sampling
Gradient-based Meta-learning
Prototypical Initialisation
Implementation Details
Experiments and Results
Experimental Setup
GossipCop Results
...and 18 more sections

Figures (5)

Figure 1: Support subgraph generation. Left: collect the $r$-radius neighbourhood of an anchor user. Middle: sub-sample using random walks from document nodes until reaching a maximum node count. Right: unmask document nodes inversely proportional to the number of subgraphs they appear in. Colours correspond to classes.
Figure 2: Generalisation of various models to CoAID and TwitterHateSpeech, in terms of MCC. See-through markers give the performance of each model instance, with error bars giving the 90% CI. Solid markers give the performance averaged across model instances. Markers are offset to avoid overlap. The horizontal axis gives the support graph $k$-shot. The dashed grey line for CoAID gives the zero-shot performance of the Subgraphs model, i.e., direct domain transfer. Colour and shape together denote a model instance.
Figure 3: GossipCop training losses for (top) maml-lh and (bottom) maml-rh. The left column gives the loss of the models on the support and query sets. Support loss is computed prior to the first adaptation step (blue and dashed orange lines), query loss after the last adaptation step (pink and dashed green lines). The right column provides the support loss prior to the first adaptation step (blue line) and after the last adaptation step (orange line). Finally, the green line gives the relative improvement of the support set loss.
Figure 4: Kernel density estimates for the distribution of relative excess homophily (Equation \ref{eq:rel_excess_homophily}) for sampled subgraphs. The left column presents user-centred sampled graphs; the right column gives the $r$-radius neighbourhoods about document nodes from the query graph. The different rows give different datasets. On the x-axis, 0 corresponds to a random graph and 1 to a perfectly homophilic graph. Values below 0 indicate heterophily.
Figure 5: TwitterHateSpeech results using only protonet at much larger values of $k$.
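
The relative excess homophily plotted in Figure 4 is defined by an equation that is not reproduced on this page. The sketch below shows one common normalisation that matches the anchor points given in the caption (0 for a random graph, 1 for perfect homophily, negative values for heterophily). It is an assumption for illustration, not necessarily the paper's definition: observed edge homophily is compared against the value expected if labels were assigned independently of graph structure, and the function name and edge-list input format are hypothetical.

```python
from collections import Counter


def relative_excess_homophily(edges, labels):
    """(h_observed - h_expected) / (1 - h_expected) over labelled node pairs.

    edges:  iterable of (u, v) pairs
    labels: dict mapping node -> class label (unlabelled nodes are ignored)
    """
    labelled = [(u, v) for u, v in edges if u in labels and v in labels]
    if not labelled:
        raise ValueError("no edges between labelled nodes")

    # Fraction of edges whose endpoints share a label.
    observed = sum(labels[u] == labels[v] for u, v in labelled) / len(labelled)

    # Chance that two independently drawn nodes share a label.
    counts = Counter(labels.values())
    total = sum(counts.values())
    expected = sum((c / total) ** 2 for c in counts.values())
    if expected == 1.0:
        raise ValueError("only one class present; the measure is undefined")

    return (observed - expected) / (1.0 - expected)
```

Under this normalisation, a subgraph in which every edge joins same-class nodes scores 1, while one in which every edge joins different-class nodes scores below 0.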
