A unified framework for identifying influential nodes in hypergraphs

Yajing Hao, Longzhao Liu, Xin Wang, Zhihao Han, Ming Wei, Zhiming Zheng, Shaoting Tang

TL;DR

The paper tackles the challenge of identifying influential nodes in hypergraphs by integrating propagation dynamics with higher-order topology. It proposes the Initial Propagation Score (IPS), a dynamics-aware centrality that uses early-time propagation information, with analytical forms such as $IPS_1^{HCP} = 1 + \lambda \sum_{h \in N_h^1(s)}(|h|-1)$, to predict long-term outbreak sizes. Across more than 20 real-world hypergraphs and multiple dynamics, IPS consistently outperforms traditional centralities, demonstrates robustness to parameter changes, and scales with local information, while also transferring across contagion models and even into opinion dynamics via the higher-order naming game. The framework offers interpretable, physically grounded insights and provides a principled basis for designing interventions in epidemiology, information diffusion, and collective decision-making, with potential extensions to physics-informed data-driven methods.
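The first-order HCP score quoted above can be evaluated directly from a hyperedge list. The following is a minimal sketch, not the authors' implementation. It assumes that $N_h^1(s)$ denotes the set of hyperedges containing node $s$ and that $\lambda$ is the infection rate; the function name `ips1_hcp` and the input layout are hypothetical.

```python
from typing import Dict, Hashable, Iterable


def ips1_hcp(hyperedges: Iterable[Iterable[Hashable]], lam: float) -> Dict[Hashable, float]:
    """Sketch of IPS_1^HCP(s) = 1 + lam * sum over hyperedges h containing s of (|h| - 1).

    Assumption: N_h^1(s) is read as "all hyperedges that contain s".
    Nodes that appear in no hyperedge are not returned.
    """
    # reach[s] accumulates sum_h (|h| - 1): the co-members of s, counted per hyperedge.
    reach: Dict[Hashable, int] = {}
    for h in hyperedges:
        members = set(h)  # ignore repeated nodes inside one hyperedge
        for s in members:
            reach[s] = reach.get(s, 0) + len(members) - 1
    # The leading 1 is the constant term of the quoted formula.
    return {s: 1.0 + lam * total for s, total in reach.items()}


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "d"}]
    for node, score in sorted(ips1_hcp(toy, lam=0.1).items()):
        print(node, round(score, 3))
```

Because the score only needs the hyperedges incident to each node, one pass over the hyperedge list suffices, which is in line with the TL;DR's remark that IPS relies on local information.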

Abstract

Identifying influential nodes plays a pivotal role in understanding, controlling, and optimizing the behavior of complex systems, ranging from social to biological and technological domains. Yet most centrality-based approaches rely on pairwise topology and are purely structural, neglecting the higher-order interactions and the coupling between structure and dynamics. Consequently, the practical effectiveness of existing approaches remains uncertain when applied to complex spreading processes. To bridge this gap, we propose a unified framework, Initial Propagation Score (IPS), to directly embed propagation dynamics into influence assessment on higher-order networks. We analytically derive mechanism-aware influence measures by relating the early-stage dynamics and local topological characteristics to long-term outbreak sizes, and such explicit physical context endows IPS with robustness, transferability, and interpretability. Extensive experiments across multiple dynamics and more than 20 real-world hypergraphs show that IPS consistently outperforms other leading baseline centralities. Furthermore, IPS estimates node influence with only local neighborhood information, yielding computational efficiency and scalability to large-scale networks. This work underscores the necessity of considering dynamics for reliable identification of influential nodes and provides a concise principled basis for optimizing interventions in epidemiology, information diffusion, and collective intelligence.

Paper Structure

This paper contains 8 sections, 9 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Motivation and schematic of the IPS method. a-c: The top 25% of nodes in Email-Enron are highlighted based on their normalized influence (using the bottom color map), under different spreading dynamics and parameter settings: a HCP model with parameter CP1, b HTC model with parameter TP1, c HTC model with parameter TP2. The remaining nodes are uniformly colored in dark blue, and the node size is proportional to its propagation range at the first time step. Results demonstrate that node influence is highly sensitive to the choice of spreading dynamics and parameter settings. d and e: Complete comparison of node influence under different dynamical configurations. Each bar contains all the nodes in Email-Enron, and nodes are colored by their normalized node influence under the given model and dynamical parameters. Nodes are arranged in the same order as in the first bar (CP1 and TP1, respectively). Results show that dynamics affect node influence rankings, and that the sensitivity to parameters varies across models. Node influence is averaged over 10,000 independent simulations, with dynamical parameters: CP1: $\nu=1$ and $\lambda=0.0115$, CP2: $\nu=2$ and $\lambda=0.0108$, CP3: $\nu=1$ and $\lambda=0.02$, TP1: $\theta=0.25$ and $\eta=0.05$, TP2: $\theta=0.5$ and $\eta=0.1$, TP3: $\theta=1/37$ and $\eta=0.027$, $\mu=1$ for all cases (see models and the meaning of dynamical parameters in Methods). f: The idea and schematic of the IPS method. The importance of a node is essentially its range of influence. Taking spreading dynamics as a paradigmatic example, the influence of a node $s$ can be represented by the expected number of infected individuals given $s$ as the initial seed, i.e., $\mathbb{E}(R_\infty \mid s)$. For convenience, suppose $\mu=1$. $\mathbb{E}(R_\infty \mid s)$ can be decomposed as the accumulation of $\mathbb{E}(I_t\mid s)$, and approximated by $\mathbb{E}(I_t+R_t\mid s)$, since $\mathbb{E}(I_\infty\mid s)=0$. The toy hypergraphs in this panel give an example of a propagation chain.
  • Figure 2: IPS successfully predicts node influence across diverse real-world hypergraphs. IPS is compared with 10 commonly used hypergraph centrality measures, evaluated by Kendall's $\tau$ in a-b and Jaccard coefficient $J(r)$ in c-j. a Each box summarizes Kendall's $\tau$ of the corresponding measure across 20 real hypergraphs. The solid and dashed lines in each box represent the median and the mean, respectively. b Values of Kendall's $\tau$ are plotted against the 20 hypergraphs, which are ordered by decreasing node count as follows: 1. congress-bills, 2. house-committees, 3. music-review, 4. M_PL_062_ins, 5. email-EU, 6. M_PL_015_ins, 7. Mid1, 8. geometry-questions, 9. M_PL_062_pl, 10. algebra-questions, 11. SFHH, 12. Elem1, 13. Thiers13, 14. senate-bills, 15. senate-committees, 16. LyonSchool, 17. InVS15, 18. email-Enron, 19. M_PL_015_pl, 20. LH10. Hyperstructure-based methods are connected by dashed lines for emphasis. Results show that IPS performs the best across almost all hypergraphs. c--j show the Jaccard coefficient $J(r)$ of each method, which measures its ability to identify key nodes at different scales. IPS (the red line) is the best in almost all cases. c--i report results for the highlighted hypergraphs in b, while panel j (threads-math-sx) reports results for an additional large-scale hypergraph with 152,702 nodes. All colors in this figure follow the upper-right legend. Simulation settings are detailed in Methods. Parameter settings are provided in Supplementary Table S1.
  • Figure 3: Explanation for the variability of method performance. This plot shows the relationship between each method’s identification performance and its rank correlation with IPS. The clear positive correlation indicates that stronger alignment with IPS leads to better identification performance. The inset shows the case of threads-math-sx (overlapped points are circled for clarity).
  • Figure 4: IPS methods consistently outperform other benchmarks under different dynamical parameters. We report Kendall's $\tau$ between each measure and the ground truth under the HCP model for different dynamical parameters. Each column of panels corresponds to the $\nu$ given in the column header; each row corresponds to the $\lambda$ given on the left side. Here, $\lambda$ is set to the same multiple of $\lambda_c$ for every hypergraph, while $\lambda_c$ itself may differ between hypergraphs. Each panel contains the results for 20 real hypergraphs. The solid line gives the median, and the dashed line gives the mean. IPS methods achieve the highest values of Kendall's $\tau$ in all cases, indicating their outstanding performance and robustness. Simulation details are given in Methods, and the specific values of dynamical parameters for each hypergraph are provided in Supplementary Table S2.
  • Figure 5: IPS performs the best across various high-order contagion dynamics. a, h: First-order IPS under the HCSA and HTC models, respectively. Box plots provide Kendall's $\tau$ between the ground-truth influence ranking under the corresponding dynamics and the ranking induced by each measure, evaluated on multiple real-world hypergraphs. b--g aggregate results under HCSA. The column headers indicate $\nu$ and the rows indicate $\lambda$: b, c: $0.1\lambda_c$, d, e: $\lambda_c$, f, g: $10\lambda_c$. Panels i--n aggregate results over 11 real-world hypergraphs for HTC; column headers indicate $\theta$ and rows indicate $\eta$: i, j: $0.1\eta_c$, k, l: $\eta_c$, m, n: $10\eta_c$ (or $2\eta_c$ instead if $10\eta_c \geq1$). In all panels, $\mu=1$. The specific dynamical parameters are given in Supplementary Table S5 and Table S7. See Methods for simulation settings.
  • ...and 1 more figures
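
The figure captions above evaluate each centrality with Kendall's $\tau$ and the Jaccard coefficient $J(r)$. The sketch below illustrates both metrics in plain Python under stated assumptions: it uses the tau-a variant (tied pairs count as neither concordant nor discordant), and it reads $J(r)$ as the Jaccard overlap between the top fraction $r$ of nodes under two rankings. The paper's exact definitions may differ, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, Sequence


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs. O(n^2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length >= 2")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            score += 1  # pair ordered the same way in both rankings
        elif prod < 0:
            score -= 1  # pair ordered oppositely
    return score / (n * (n - 1) / 2)


def jaccard_top(a: Dict[Hashable, float], b: Dict[Hashable, float], r: float) -> float:
    """Jaccard overlap of the top-r fraction of nodes under scores a and b.

    Assumption: a and b score the same node set; ties are broken by sort order.
    """
    k = max(1, int(round(r * len(a))))
    top_a = set(sorted(a, key=a.get, reverse=True)[:k])
    top_b = set(sorted(b, key=b.get, reverse=True)[:k])
    return len(top_a & top_b) / len(top_a | top_b)


if __name__ == "__main__":
    print(kendall_tau_a([1, 2, 3, 4], [1, 2, 4, 3]))
    truth = {"u": 4, "v": 3, "w": 2, "x": 1}
    guess = {"u": 4, "w": 3, "v": 2, "x": 1}
    print(jaccard_top(truth, guess, r=0.5))
```

Kendall's $\tau$ summarizes agreement over the whole ranking, whereas the top-$r$ overlap focuses on whether the most influential nodes are recovered, which is why the two are reported side by side. For large hypergraphs, a library routine with better than quadratic cost would be preferable to this pairwise loop.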