Unsupervised Social Bot Detection via Structural Information Theory

Hao Peng, Jingyun Zhang, Xiang Huang, Zhifeng Hao, Angsheng Li, Zhengtao Yu, Philip S. Yu

TL;DR

UnDBot tackles the challenge of unsupervised, interpretable social bot detection by anchoring the approach in structural information theory. It builds a multi-relational graph from behavioral similarity across three relations, optimizes a heterogeneous structural entropy to yield a two-dimensional encoding tree, and labels communities using a stationary distribution combined with cohesion, enabling bot-vs-human discrimination without labeled data. Across four real datasets, UnDBot demonstrates superior accuracy and interpretability, with ablation studies confirming the value of the multi-relational graph and entropy-based partitioning. The framework offers practical efficiency and transparent insights into community structure for robust social bot detection with potential applicability to broader networked detection tasks.

Abstract

Research on social bot detection plays a crucial role in maintaining the order and reliability of information dissemination while increasing trust in social interactions. Current mainstream social bot detection models rely on black-box neural network technology, such as Graph Neural Networks and Transformers, which lack interpretability. In this work, we present UnDBot, a novel unsupervised, interpretable, yet effective and practical framework for detecting social bots. The framework is built upon structural information theory. First, we design three social relationship metrics that capture different aspects of social bot behavior: Posting Type Distribution, Posting Influence, and Follow-to-follower Ratio. These three relationships are used to construct a new, unified, weighted social multi-relational graph that models the relevance of social user behaviors and uncovers long-distance correlations between users. Second, we introduce a novel method for optimizing heterogeneous structural entropy. This method performs personalized aggregation of edge information from the social multi-relational graph to generate a two-dimensional encoding tree. The heterogeneous structural entropy facilitates decoding the substantial structure of the social bot network and enables hierarchical clustering of social bots. Third, we present a new community labeling method that distinguishes social bot communities by computing each user's stationary distribution, measuring user contributions to the network structure, and quantifying the intensity of user aggregation within each community. Comprehensive experiments on four real social network datasets, comparing against ten representative social bot detection approaches, demonstrate the advantages of UnDBot in effectiveness and interpretability.
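
To make the entropy-based partitioning concrete, the following is a minimal sketch of the standard two-dimensional structural entropy from structural information theory, evaluated for a fixed partition of an undirected weighted graph. It is not UnDBot's heterogeneous structural entropy, whose relation-aware edge aggregation is not specified in this summary. The function names, the edge-list input format, and the toy graph are assumptions made for illustration. The second function uses the fact that a random walk on a connected undirected weighted graph has a stationary distribution proportional to weighted node degree; how UnDBot combines that distribution with community cohesion is not reproduced here.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, partition):
    """Standard 2D structural entropy (bits) of a partitioned graph.

    edges:     iterable of (u, v, weight) for an undirected graph, u != v
    partition: dict mapping every node to a community id
    NOTE: this is the homogeneous (single-relation) formula, used here as
    a stand-in; it is not the paper's heterogeneous variant.
    """
    degree = defaultdict(float)  # weighted degree of each node
    volume = defaultdict(float)  # sum of degrees inside each community
    cut = defaultdict(float)     # weight of edges leaving each community
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if partition[u] != partition[v]:
            cut[partition[u]] += w
            cut[partition[v]] += w
    for node, d in degree.items():
        volume[partition[node]] += d
    total = sum(degree.values())  # volume of the whole graph

    entropy = 0.0
    # Cost of locating a node inside its community.
    for node, d in degree.items():
        entropy -= (d / total) * math.log2(d / volume[partition[node]])
    # Cost of locating the community when a walk crosses its boundary.
    for comm, vol in volume.items():
        entropy -= (cut[comm] / total) * math.log2(vol / total)
    return entropy


def stationary_distribution(edges):
    """Degree-proportional stationary distribution of a random walk."""
    degree = defaultdict(float)
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
    total = sum(degree.values())
    return {node: d / total for node, d in degree.items()}


# Hypothetical toy graph: two triangles joined by one bridge edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {n: "all" for n in range(6)}

print(structural_entropy_2d(edges, by_triangle))
print(structural_entropy_2d(edges, one_block))
print(stationary_distribution(edges))
```

On this toy graph, the partition that follows the two triangles yields a lower entropy than placing all nodes in a single community, which is the sense in which minimizing structural entropy recovers community structure.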

Paper Structure (29 sections, 22 equations, 12 figures, 10 tables, 1 algorithm)

Figures (12)

  • Figure 1: The overall framework of UnDBot.
  • Figure 2: Multi-relational graph in social bot detection.
  • Figure 3: The Structural Information Theory-Based Social Bot Detection.
  • Figure 4: Distribution of user embeddings generated by unsupervised graph learning models on the Cresci-2015 and Cresci-2017 datasets.
  • Figure 5: Distribution of user embeddings generated by unsupervised graph learning models on the Botwiki-2019 and Pronbots-2019 datasets.
  • ...and 7 more figures

Theorems & Definitions (2)

  • Definition 3.1
  • Definition 3.2