Identifying Influential Brokers on Social Media from Social Network Structure

Sho Tsugawa, Kohei Watabe

TL;DR

Identifying influential brokers, who spread others' messages, alongside traditional source spreaders is important for understanding large-scale diffusion. The authors compare brokers, source spreaders, and centrality-based nodes across three datasets and test centrality and node-embedding features, using DeepGL to derive interpretable embeddings. They find that brokers and source spreaders are largely distinct and poorly captured by single centrality measures, while DeepGL embeddings enable broker prediction with $F_1$ scores of $0.35$–$0.68$, outperforming centrality-only baselines. However, cross-domain transfer remains challenging, and higher accuracy is needed for practical use. The results highlight the value of network topology and learned representations for broker identification; domain-specific models and the future integration of diffusion-history features are suggested as ways to improve performance. Overall, the work advances understanding of broker roles in information diffusion and provides a foundation for targeted diffusion control and marketing strategies.

Abstract

Identifying influencers in a given social network has become an important research problem for various applications, including accelerating the spread of information in viral marketing and preventing the spread of fake news and rumors. The literature contains a rich body of studies on identifying influential source spreaders who can spread their own messages to many other nodes. In contrast, the identification of influential brokers who can spread other nodes' messages to many nodes has not been fully explored. Theoretical and empirical studies suggest that involvement of both influential source spreaders and brokers is a key to facilitating large-scale information diffusion cascades. Therefore, this paper explores ways to identify influential brokers from a given social network. By using three social media datasets, we investigate the characteristics of influential brokers by comparing them with influential source spreaders and central nodes obtained from centrality measures. Our results show that (i) most of the influential source spreaders are not influential brokers (and vice versa) and (ii) the overlap between central nodes and influential brokers is small (less than 15%) in Twitter datasets. We also tackle the problem of identifying influential brokers from centrality measures and node embeddings, and we examine the effectiveness of social network features in the broker identification task. Our results show that (iii) although a single centrality measure cannot characterize influential brokers well, prediction models using node embedding features achieve $F_1$ scores of 0.35–0.68, suggesting the effectiveness of social network features for identifying influential brokers.
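
The abstract reports two kinds of numbers: the overlap between sets of top-ranked nodes, and the F1 score of a broker classifier. The following is a minimal sketch of how such quantities could be computed. It is not the authors' code. The function names, the toy scores, and the choice to define overlap as the shared fraction of two equal-sized top sets are all assumptions made for illustration; only the F1 formula (the harmonic mean of precision and recall) is standard.

```python
def top_fraction(scores, fraction):
    """Return the IDs of the highest-scoring nodes (the top `fraction` of all nodes)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def overlap(set_a, set_b):
    """Share of set_a's members that also appear in set_b (assumed definition)."""
    return len(set_a & set_b) / len(set_a) if set_a else 0.0


def f1_score(predicted, actual):
    """Standard F1: harmonic mean of precision and recall over two node sets."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: a centrality value and a broker score per node.
centrality = {"a": 9, "b": 8, "c": 7, "d": 6, "e": 3, "f": 2, "g": 1, "h": 0}
broker_score = {"a": 9, "e": 8, "f": 7, "g": 6, "b": 2, "c": 1, "d": 0.5, "h": 0}

central_nodes = top_fraction(centrality, 0.5)          # top half by centrality
influential_brokers = top_fraction(broker_score, 0.5)  # top half by broker score
central_vs_broker = overlap(central_nodes, influential_brokers)

# Hypothetical classifier output compared against the "true" influential brokers.
predicted_brokers = {"a", "b", "e"}
broker_f1 = f1_score(predicted_brokers, influential_brokers)
```

In this toy setup only one node is both central and an influential broker, which mirrors in miniature the small overlap the abstract describes for the Twitter datasets.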

Paper Structure (26 sections, 3 equations, 4 figures, 8 tables)

Figures (4)

  • Figure 1: Source spreaders and brokers. Influential source spreaders are users who can spread their own messages to many other users, and influential brokers are users who can spread other users' messages to many users.
  • Figure 2: Confusion matrices for influencers and central nodes. Overlap scores between source spreaders and brokers are low. The overlaps between central nodes and brokers are low for all datasets, but the overlaps with source spreaders are relatively high.
  • Figure 3: Comparison of average broker score per retweet and the number of retweets posted by users among influential source spreaders, influential brokers, and all users. A tweet retweeted by an influential broker tends to spread more widely than does a tweet retweeted by an influential source spreader.
  • Figure 4: Effects of amount of training data when predicting top-10% influencers. Using 5% training data achieves $F_1$ scores that are similar to those with 50% training data.