BotUmc: An Uncertainty-Aware Twitter Bot Detection with Multi-view Causal Inference

Tao Yang, Yang Hu, Feihong Lu, Ziwei Zhang, Qingyun Sun, Jianxin Li

TL;DR

BotUmc tackles low-confidence Twitter bot detection with an uncertainty-aware framework that combines LLM-based knowledge reasoning, interventional multi-view graph learning, and uncertainty quantification inspired by Dempster-Shafer theory (DST). It constructs heterogeneous graphs from text, metadata, and relations, and uses causal interventions to generate diverse environments, so that two relational graph convolutional network (RGCN) views can learn robust features. Uncertainty is quantified through Dirichlet evidence and fused to select the most credible prediction across views, achieving superior performance on Cresci-15, TwiBot-20, and TwiBot-22. The approach improves the reliability of bot detection and provides a principled mechanism for rejecting dubious decisions, with potential for broader multimodal and cross-platform applications.

Abstract

Social bots are now familiar to users of social platforms. To keep them from spreading harmful speech, many bot detection methods have been proposed. However, as social bots evolve, detection methods struggle to give high-confidence answers for individual samples. This motivates us to quantify the uncertainty of the outputs so that each result carries a confidence estimate. We therefore propose an uncertainty-aware bot detection method that reports this confidence and uses the uncertainty score to pick a high-confidence decision from multiple views of a social network under different environments. Specifically, our proposed BotUmc uses an LLM to extract information from tweets. We then construct a graph from the extracted information, the original user information, and the user relationships, and generate multiple views of the graph by causal intervention. Lastly, an uncertainty loss forces the model to quantify the uncertainty of its results, and the low-uncertainty result from one view is selected as the final decision. Extensive experiments on Cresci-15, TwiBot-20, and TwiBot-22 show the superiority of our method.
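The uncertainty-based selection described in the abstract can be illustrated with a minimal sketch. It assumes the standard subjective-logic mapping used in evidential deep learning, where non-negative class evidence e_k gives Dirichlet parameters alpha_k = e_k + 1, strength S = sum(alpha), belief b_k = e_k / S, and uncertainty u = K / S. The function names, the evidence values, and the plain "lowest uncertainty wins" rule are hypothetical choices for illustration and are not taken from the paper, whose exact loss and fusion rule may differ.

```python
from typing import List, Optional, Sequence, Tuple


def dirichlet_opinion(evidence: Sequence[float]) -> Tuple[List[float], float]:
    """Map non-negative class evidence to (belief masses, uncertainty)."""
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]      # Dirichlet parameters
    strength = sum(alpha)                    # Dirichlet strength S
    belief = [e / strength for e in evidence]
    uncertainty = num_classes / strength     # u = K / S, so sum(belief) + u == 1
    return belief, uncertainty


def pick_most_confident_view(
    views: Sequence[Sequence[float]],
) -> Optional[Tuple[int, int, float]]:
    """Return (view index, predicted class, uncertainty) of the least uncertain view."""
    best = None
    for idx, evidence in enumerate(views):
        belief, u = dirichlet_opinion(evidence)
        pred = max(range(len(belief)), key=belief.__getitem__)
        if best is None or u < best[2]:
            best = (idx, pred, u)
    return best


# Hypothetical evidence for the classes [human, bot] from two graph views.
views = [[1.0, 2.0], [0.5, 17.5]]
view_idx, label, u = pick_most_confident_view(views)
print(view_idx, label, round(u, 3))  # prints: 1 1 0.1
```

In this toy case both views predict "bot", but the second view has collected far more evidence, so its uncertainty (0.1) is lower than that of the first view (0.4) and its prediction is the one kept.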

Paper Structure

This paper contains 15 sections, 14 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: (a) Comparison between previous Twitter bot detection and our uncertainty-aware bot detection. (b) Relationship between the accuracy of bot detection and the uncertainty of results. The bins indicate the count of users whose results are within a certain range of uncertainty. The lines show the corresponding performance.
  • Figure 2: Overview of our proposed BotUmc. It jointly uses multiple types of user information (text, metadata, and topology) to detect bots. Twitter users' tweets are first processed by the LLM module, then encoded together with the other user information, and then processed by the causal intervention module. Finally, the uncertainty module integrates the results for Twitter users under multiple views to classify them.
  • Figure 3: The causal structure.
  • Figure 4: F1 scores and accuracies of our proposed BotUmc for different values of the hyperparameters $\lambda_1$ and $\lambda_2$ on TwiBot-20. Both hyperparameters range from 0.1 to 0.9 in steps of 0.1.
  • Figure 5: Case study: the text, metadata, and graph information of a hidden bot account, as well as key information extracted from tweets and uncertainty scores of model outputs under different environments.
  • ...and 1 more figure