
Influence Maximization in Hypergraphs Using A Genetic Algorithm with New Initialization and Evaluation Methods

Xilong Qu, Wenbin Pei, Yingchao Yang, Xirong Xu, Renquan Zhang, Qiang Zhang

TL;DR

This work tackles influence maximization in hypergraphs, where high-order interactions are captured by hyperedges. It proposes a hypergraph independent cascade model and a genetic algorithm (G-CIIM) that uses hypergraph collective influence (HCI) for initialization, a joint node–hyperedge fitness function, and a mutation operator designed for overlap and collective effects. Empirical results on synthetic (ER, SF, K-UF) and real-world hypergraphs show that G-CIIM consistently outperforms eight baselines, with larger advantages as the seed set size grows, and ablation confirms the importance of initialization and mutation. The approach demonstrates the practicality of GA-based IM in complex hypergraph propagation and provides insights into how node–hyperedge coupling shapes influence spread, with potential for broader applicability and future optimization of dynamics and parameter sensitivity.

Abstract

Influence maximization (IM) is a crucial optimization task in the analysis of real-world complex networks, such as social networks, disease propagation networks, and marketing networks. Publications to date on the IM problem focus mainly on graphs, which fail to capture real-world high-order interactions. The use of hypergraphs for the IM problem has therefore been receiving increasing attention. However, identifying the most influential nodes in hypergraphs remains challenging, mainly because nodes and hyperedges are often strongly coupled and correlated. In this paper, to identify the most influential nodes effectively, we first propose a novel hypergraph independent cascade model that integrates the influences of both node and hyperedge failures. We then introduce a genetic algorithm (GA) that leverages hypergraph collective influence to identify the most influential nodes. In the GA-based method, the hypergraph collective influence is used to initialize the population, which improves the quality of the initial candidate solutions. The designed fitness function considers the joint influence on both nodes and hyperedges, so that each candidate seed set is evaluated by its effect on both. Moreover, a new mutation operator is designed that takes into account two factors, the collective influence and the overlapping effects of nodes in hypergraphs, to breed high-quality offspring. In the experiments, several simulations on both synthetic and real hypergraphs have been conducted, and the results demonstrate that the proposed method outperforms the compared methods.

Paper Structure (18 sections, 27 equations, 13 figures, 4 tables, 1 algorithm)


Figures (13)

  • Figure 1: An example of a hypergraph. (a) A hypergraph with $6$ nodes and $3$ hyperedges. (b) The corresponding incidence matrix for the hypergraph.
  • Figure 2: The overall framework diagram of the G-CIIM algorithm.
  • Figure 3: Propagation rule of the IC model in a hypergraph. Circles represent nodes and ovals represent hyperedges. Initially, node $3$ is designated as the seed node, and the propagation probabilities $t_{i e_\gamma}$ and $s_{e_\gamma i}$ are set to $0.5$. When the process begins, the failure of node $3$ causes hyperedges $e_2$ and $e_3$ to fail. In the next time step, the failure of hyperedge $e_2$ causes nodes $4$ and $5$ to fail. The failure of node $5$ then causes hyperedge $e_4$ to fail, which in turn causes node $7$ to fail. At this point no further nodes or hyperedges can fail, and the propagation process terminates.
  • Figure 4: Illustration of hypergraph collective influence. Green circles represent nodes, orange triangles represent hyperedges, and each edge indicates that a node belongs to a hyperedge. (a) The 1-order hypergraph collective influence of node $i$ is the sum of two components. The first, denoted $k_i$, is the hyperdegree of node $i$, i.e., the number of hyperedges directly connected to node $i$. The second is the sum, over each associated hyperedge $e_\gamma$, of the probability of node $i$ propagating to $e_\gamma$ multiplied by the cardinality of $e_\gamma$ minus one. (b) The 2-order hypergraph collective influence of node $i$ is also the sum of two components. The first is the 1-order hypergraph collective influence of node $i$. The second is the probability of node $i$ propagating to an associated hyperedge $e_\gamma$, multiplied by the probability of $e_\gamma$ propagating to an associated node $j$, and further multiplied by the hyperdegree of node $j$ minus one, summed over all hyperedges $e_\gamma$ and nodes $j$ with $e_\gamma \in \partial i$ and $j \in e_\gamma \setminus \{i\}$.
  • Figure 5: Illustration of the population initialization process. Population initialization is divided into two parts: in part A, nodes with a large HCI-ICM value have a higher chance of being selected; part B uses a random selection strategy.
  • ...and 8 more figures
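
The captions of Figures 3 and 4 describe the cascade rule and the collective-influence score in words; the following is a minimal Python sketch of both, written only from those captions. The function names `simulate_cascade` and `hci`, the toy hypergraph, and the use of uniform probabilities `t` (node to hyperedge) and `s` (hyperedge to node) are assumptions made for illustration, and the round structure of the loop is a simplification of the paper's time steps rather than the authors' implementation.

```python
import random

def simulate_cascade(hyperedges, seeds, t=0.5, s=0.5, rng=None):
    """One run of a node <-> hyperedge independent cascade (illustrative).

    hyperedges: dict mapping hyperedge id -> set of node ids.
    t: probability that a newly failed node fails an incident hyperedge.
    s: probability that a newly failed hyperedge fails a member node.
    Returns (failed_nodes, failed_hyperedges).
    """
    rng = rng or random.Random()
    # Invert the incidence structure: node -> hyperedges containing it.
    incident = {}
    for e, members in hyperedges.items():
        for v in members:
            incident.setdefault(v, set()).add(e)

    failed_nodes, failed_edges = set(seeds), set()
    new_nodes = set(seeds)
    while new_nodes:
        # Newly failed nodes each get one attempt per intact incident hyperedge.
        new_edges = set()
        for v in sorted(new_nodes):
            for e in sorted(incident.get(v, ())):
                if e not in failed_edges and e not in new_edges and rng.random() < t:
                    new_edges.add(e)
        failed_edges |= new_edges
        # Newly failed hyperedges each get one attempt per intact member node.
        new_nodes = set()
        for e in sorted(new_edges):
            for v in sorted(hyperedges[e]):
                if v not in failed_nodes and v not in new_nodes and rng.random() < s:
                    new_nodes.add(v)
        failed_nodes |= new_nodes
    return failed_nodes, failed_edges

def hci(hyperedges, i, order=1, t=0.5, s=0.5):
    """1- or 2-order hypergraph collective influence, per the Figure 4 caption."""
    def k(v):  # hyperdegree: number of hyperedges containing v
        return sum(1 for members in hyperedges.values() if v in members)

    incident_i = [e for e, members in hyperedges.items() if i in members]
    # 1-order: k_i + sum over incident hyperedges of t * (|e| - 1).
    value = k(i) + sum(t * (len(hyperedges[e]) - 1) for e in incident_i)
    if order == 2:
        # 2-order adds t * s * (k_j - 1) over every other member j of each hyperedge.
        value += sum(t * s * (k(j) - 1)
                     for e in incident_i for j in hyperedges[e] - {i})
    return value

# Hypothetical toy hypergraph, loosely modelled on the Figure 3 walkthrough.
toy = {"e1": {1, 2}, "e2": {3, 4, 5}, "e3": {3, 6}, "e4": {5, 7}}
nodes, edges = simulate_cascade(toy, seeds={3}, t=1.0, s=1.0)
print(sorted(nodes), sorted(edges))
print(hci(toy, 3, order=1), hci(toy, 3, order=2))
```

With `t = s = 1.0` the cascade is deterministic, which makes the sketch easy to check by hand; with the caption's value of $0.5$ the spread would be estimated by averaging many runs, for example with differently seeded `random.Random` instances.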