Supervised centrality via sparse network influence regression: an application to the 2021 Henan floods' social network

Yingying Ma, Wei Lan, Chenlei Leng, Ting Li, Hansheng Wang

TL;DR

The paper tackles the limitation of topology-only centrality by introducing supervised centrality via the sparse network influence regression (SNIR) model, which yields node-specific, sparse influence parameters $\rho_j$ in the network response equation $Y_i = \mu_i + \sum_j \rho_j a_{ij} Y_j + \varepsilon_i$. It develops a scalable estimation strategy based on conditional quasi-maximum likelihood and a forward-addition screening procedure with EBIC for model selection, establishing screening consistency and asymptotic normality on the selected set. Applied to the 2021 Henan Floods Sina Weibo data, SNIR identifies task-specific influential users for three response types (reposts, comments, likes) and shows improved fit over SAR while revealing influential actors not captured by purely topology- or response-based methods. Extensive simulations corroborate the method’s ability to recover true influencers in large, sparse networks and demonstrate computational efficiency, with potential extensions to dynamics and covariates. The work provides a robust, scalable framework for task-driven centrality that can guide information propagation and misinformation mitigation in large social networks.
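The response equation above can be written in matrix form as $Y = \mu + A\,\mathrm{diag}(\rho)\,Y + \varepsilon$, so a draw of $Y$ solves the linear system $(I - A\,\mathrm{diag}(\rho))\,Y = \mu + \varepsilon$. The following is a minimal simulation sketch of that equation only, not the authors' code. The function name `simulate_snir`, the row-normalization of $A$, the Gaussian noise, and the toy three-user network are illustrative assumptions.

```python
import numpy as np


def simulate_snir(A, rho, mu, sigma=1.0, rng=None):
    """Draw Y from Y = mu + A diag(rho) Y + eps by solving a linear system.

    A     : (N, N) weight matrix with entries a_ij (assumed row-normalized here)
    rho   : (N,) node-specific influence parameters; zero means non-influential
    mu    : (N,) individual means
    sigma : noise standard deviation (Gaussian noise is an assumption)
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    eps = rng.normal(scale=sigma, size=n)
    # A * rho scales column j of A by rho_j, which equals A @ diag(rho).
    system = np.eye(n) - A * rho
    return np.linalg.solve(system, mu + eps)


# Toy network: only user 1 has a nonzero influence parameter.
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
A = A / A.sum(axis=1, keepdims=True)  # row-normalize (assumption)
rho = np.array([0.0, 0.5, 0.0])
mu = np.ones(3)
Y = simulate_snir(A, rho, mu, sigma=0.0, rng=0)
```

With a row-normalized $A$ and every $|\rho_j| < 1$, each row of $A\,\mathrm{diag}(\rho)$ has absolute sum below one, so the system is invertible. In the toy example only the users who have an edge to user 1 (users 0 and 2) deviate from their means.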

Abstract

The social characteristics of players in a social network are closely associated with their network positions and relational importance. Identifying those influential players in a network is of great importance as it helps to understand how ties are formed, how information is propagated, and, in turn, can guide the dissemination of new information. Motivated by a Sina Weibo social network analysis of the 2021 Henan Floods, where response variables for each Sina Weibo user are available, we propose a new notion of supervised centrality that emphasizes the task-specific nature of a player's centrality. To estimate the supervised centrality and identify important players, we develop a novel sparse network influence regression by introducing individual heterogeneity for each user. To overcome the computational difficulties in fitting the model for large social networks, we further develop a forward-addition algorithm and show that it can consistently identify a superset of the influential Sina Weibo users. We apply our method to analyze three responses in the Henan Floods data: the number of comments, reposts, and likes, and obtain meaningful results. A further simulation study corroborates the developed method.
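To make the forward-addition idea concrete, here is a rough sketch of a generic greedy forward search followed by EBIC selection along the resulting path. It is not the paper's procedure: ordinary least squares stands in for the conditional quasi-maximum likelihood step, the EBIC form used is the common Gaussian one, $n\log(\mathrm{RSS}/n) + k\log n + 2\gamma k\log p$, and the names `ebic`, `forward_addition`, `max_steps`, and `gamma` are hypothetical. In the network setting, the candidate column for user $j$ would be the vector with entries $a_{ij} Y_j$.

```python
import numpy as np


def ebic(rss, n, k, p, gamma=1.0):
    """Gaussian-likelihood EBIC for a model with k of p candidates (assumed form)."""
    return n * np.log(rss / n) + k * np.log(n) + 2.0 * gamma * k * np.log(p)


def forward_addition(X, y, max_steps, gamma=1.0):
    """Greedy forward search, then pick the path prefix with the smallest EBIC.

    Returns (chosen, order): `order` is the full sequence of added columns,
    `chosen` is the prefix of `order` that minimizes EBIC.
    OLS is a stand-in for the paper's quasi-likelihood fit (assumption).
    """
    n, p = X.shape
    order, scores = [], []
    for _ in range(min(max_steps, p)):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in order:
                continue
            cols = X[:, order + [j]]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            rss = float(np.sum((y - cols @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        # Add the candidate that reduces the residual sum of squares the most.
        order.append(best_j)
        scores.append(ebic(best_rss, n, len(order), p, gamma))
    if not scores:
        return [], []
    best_size = int(np.argmin(scores)) + 1
    return order[:best_size], order


# Synthetic check on a generic design with two active columns (2 and 7).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=200)
chosen, order = forward_addition(X, y, max_steps=5)
```

Running the greedy search for a fixed number of steps and only then applying EBIC mirrors the screen-then-select structure described in the abstract: the path is meant to contain a superset of the influential users, and the criterion trims it.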

Paper Structure

This paper contains 22 sections, 4 theorems, 60 equations, 5 figures, 12 tables, 1 algorithm.

Key Result

Theorem 3.1

Under Conditions (C1)--(C6) discussed in the Appendix, as $N \rightarrow \infty$, with probability tending to one, (1) the forward-addition algorithm finds a superset of $\mathcal{S}_1$ in $m^*$ steps. That is, $P( \mathcal{S}_1 \subset \mathcal{S}^{(m^*)}) \rightarrow 1$; (2) we have $P(

Figures (5)

  • Figure 1: Histograms of in-degrees (left) and out-degrees (right) for the Henan Floods network.
  • Figure 2: Density plots of reposts, comments, and likes after logarithmic transformation.
  • Figure 3: A hypothetical network for SNIR model. The size of each user is proportional to its in-degree. Black users are influential and white users are non-influential.
  • Figure 4: Empirical distributions of the three responses.
  • Figure 5: Proportion of correctly detected influential users as the signal-to-noise ratio (SNR) increases for various response types.

Theorems & Definitions (5)

  • Remark 1
  • Theorem 3.1
  • Corollary 1
  • Theorem E.1
  • Theorem E.2