Supervised centrality via sparse network influence regression: an application to the 2021 Henan floods' social network
Yingying Ma, Wei Lan, Chenlei Leng, Ting Li, Hansheng Wang
TL;DR
The paper addresses the limitation of topology-only centrality by introducing supervised centrality via the sparse network influence regression (SNIR) model, which yields node-specific, sparse influence parameters $\rho_j$ in the network response equation $Y_i = \mu_i + \sum_j \rho_j a_{ij} Y_j + \varepsilon_i$. It develops a scalable estimation strategy based on conditional quasi-maximum likelihood and a forward-addition screening procedure with EBIC for model selection, and establishes screening consistency and asymptotic normality on the selected set. Applied to Sina Weibo data from the 2021 Henan Floods, SNIR identifies task-specific influential users for three response types (reposts, comments, likes), fits better than SAR, and reveals influential actors not captured by purely topology- or response-based methods. Extensive simulations corroborate the method’s ability to recover true influencers in large, sparse networks and demonstrate its computational efficiency. Potential extensions include dynamics and covariates. The work provides a robust, scalable framework for task-driven centrality that can guide information propagation and misinformation mitigation in large social networks.
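To make the response equation concrete, the following is a minimal simulation sketch, not the authors' code. In matrix form the equation reads $Y = \mu + A\,\mathrm{diag}(\rho)\,Y + \varepsilon$, so $Y = (I - A\,\mathrm{diag}(\rho))^{-1}(\mu + \varepsilon)$ whenever the inverse exists. The function name `simulate_snir`, the row-normalized random adjacency matrix, the Gaussian noise, and all numeric settings (network size, edge density, five influencers with $\rho_j = 0.5$) are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def simulate_snir(A, rho, mu, rng, noise_sd=1.0):
    """Draw one response vector from Y = mu + A diag(rho) Y + eps.

    Hypothetical helper: returns both Y and the noise eps that generated it.
    """
    n = A.shape[0]
    eps = rng.normal(0.0, noise_sd, size=n)
    # Broadcasting A * rho scales column j of A by rho[j], i.e. A @ diag(rho).
    S = np.eye(n) - A * rho
    # Solve (I - A diag(rho)) Y = mu + eps instead of forming the inverse.
    Y = np.linalg.solve(S, mu + eps)
    return Y, eps

rng = np.random.default_rng(0)
n = 200

# Assumed network: sparse directed random graph without self-loops.
A = (rng.random((n, n)) < 0.05).astype(float)
np.fill_diagonal(A, 0.0)

# Assumed row normalization; with |rho_j| < 1 this keeps I - A diag(rho) invertible.
row_sums = A.sum(axis=1, keepdims=True)
A = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

# Sparse influence vector: only a few nodes have nonzero rho_j.
rho = np.zeros(n)
influencers = rng.choice(n, size=5, replace=False)
rho[influencers] = 0.5

mu = np.zeros(n)
Y, eps = simulate_snir(A, rho, mu, rng)
```

In this sketch a node $j$ with $\rho_j = 0$ contributes nothing to any other node's response, however many followers it has. That is the sense in which influence here is tied to the response rather than to topology alone.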
Abstract
The social characteristics of players in a social network are closely associated with their network positions and relational importance. Identifying those influential players in a network is of great importance as it helps to understand how ties are formed, how information is propagated, and, in turn, can guide the dissemination of new information. Motivated by a Sina Weibo social network analysis of the 2021 Henan Floods, where response variables for each Sina Weibo user are available, we propose a new notion of supervised centrality that emphasizes the task-specific nature of a player's centrality. To estimate the supervised centrality and identify important players, we develop a novel sparse network influence regression by introducing individual heterogeneity for each user. To overcome the computational difficulties in fitting the model for large social networks, we further develop a forward-addition algorithm and show that it can consistently identify a superset of the influential Sina Weibo users. We apply our method to analyze three responses in the Henan Floods data: the number of comments, reposts, and likes, and obtain meaningful results. A further simulation study corroborates the developed method.
