A Modality-Aware Cooperative Co-Evolutionary Framework for Multimodal Graph Neural Architecture Search

Sixuan Wang, Jiao Yin, Jinli Cao, Mingjian Tang, Yong-Feng Ge

TL;DR

The paper tackles vulnerability co-exploitation by automating the design of multimodal graph neural networks. It introduces MACC-MGNAS, a modality-aware cooperative co-evolutionary NAS framework that decomposes architecture search into modality-specific and fusion components, aided by the MADTS surrogate and SPDI diversity control. Empirical results on the VulCE dataset show that MACC-MGNAS reaches an F1 of 81.67% (best 84.04%) in about 3 GPU-hours, outperforming handcrafted MGNNs and state-of-the-art NAS methods while reducing computational cost. Ablation and convergence analyses highlight the importance of modality-aware coordination, efficient surrogate-guided search, and adaptive diversity in achieving fast, robust convergence. The architecture evolution analysis provides transferable design principles for MGNNs, such as multiplicative message interactions, increased hidden capacity, and normalized fusion, with practical implications for scalable multimodal graph learning.

Abstract

Co-exploitation attacks on software vulnerabilities pose severe risks to enterprises, a threat that can be mitigated by analyzing heterogeneous and multimodal vulnerability data. Multimodal graph neural networks (MGNNs) are well-suited to integrate complementary signals across modalities, thereby improving attack-prediction accuracy. However, designing an effective MGNN architecture is challenging because it requires coordinating modality-specific components at each layer, which is infeasible through manual tuning. Genetic algorithm (GA)-based graph neural architecture search (GNAS) provides a natural solution, yet existing methods are confined to single modalities and overlook modality heterogeneity. To address this limitation, we propose a modality-aware cooperative co-evolutionary algorithm for multimodal graph neural architecture search, termed MACC-MGNAS. First, we develop a modality-aware cooperative co-evolution (MACC) framework under a divide-and-conquer paradigm: a coordinator partitions a global chromosome population into modality-specific gene groups, local workers evolve them independently, and the coordinator reassembles chromosomes for joint evaluation. This framework effectively captures modality heterogeneity ignored by single-modality GNAS. Second, we introduce a modality-aware dual-track surrogate (MADTS) method to reduce evaluation cost and accelerate local gene evolution. Third, we design a similarity-based population diversity indicator (SPDI) strategy to adaptively balance exploration and exploitation, thereby accelerating convergence and avoiding local optima. On a standard vulnerabilities co-exploitation (VulCE) dataset, MACC-MGNAS achieves an F1-score of 81.67% within only 3 GPU-hours, outperforming the state-of-the-art competitor by 8.7% F1 while reducing computation cost by 27%.
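The decompose, evolve, and reassemble loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the gene layout (`GROUPS`), the number of options per gene (`N_CHOICES`), the hill-climbing worker, and the toy fitness function are all hypothetical stand-ins, and the MADTS surrogate and SPDI control are omitted. In the paper, fitness would come from training and validating the decoded MGNN.

```python
import random
from typing import Callable, Dict, List

Chromosome = List[int]

# Hypothetical layout: which gene positions belong to which worker.
GROUPS: Dict[str, slice] = {
    "modality_a": slice(0, 3),
    "modality_b": slice(3, 6),
    "fusion": slice(6, 8),
}
N_CHOICES = 4  # hypothetical number of options per gene
TARGET = [1, 2, 3, 0, 1, 2, 3, 0]  # toy optimum, stands in for a good architecture


def toy_fitness(chrom: Chromosome) -> float:
    """Toy stand-in for joint evaluation: count genes matching TARGET."""
    return float(sum(g == t for g, t in zip(chrom, TARGET)))


def decompose(chrom: Chromosome) -> Dict[str, List[int]]:
    """Coordinator step: split a chromosome into per-worker gene blocks."""
    return {name: list(chrom[s]) for name, s in GROUPS.items()}


def merge(blocks: Dict[str, List[int]]) -> Chromosome:
    """Coordinator step: reassemble blocks into one full chromosome."""
    chrom = [0] * max(s.stop for s in GROUPS.values())
    for name, s in GROUPS.items():
        chrom[s] = blocks[name]
    return chrom


def local_worker(name: str, blocks: Dict[str, List[int]],
                 fitness: Callable[[Chromosome], float],
                 rng: random.Random, trials: int = 10) -> List[int]:
    """Evolve one block while the other blocks stay fixed as context."""
    best = list(blocks[name])
    best_fit = fitness(merge({**blocks, name: best}))
    for _ in range(trials):
        cand = list(best)
        cand[rng.randrange(len(cand))] = rng.randrange(N_CHOICES)  # point mutation
        fit = fitness(merge({**blocks, name: cand}))
        if fit > best_fit:
            best, best_fit = cand, fit
    return best


def coordinate(chrom: Chromosome, fitness: Callable[[Chromosome], float],
               generations: int = 5, seed: int = 0) -> Chromosome:
    """Decompose, let each worker evolve its block, merge, evaluate jointly."""
    rng = random.Random(seed)
    for _ in range(generations):
        blocks = decompose(chrom)
        elites = {name: local_worker(name, blocks, fitness, rng) for name in GROUPS}
        candidate = merge(elites)
        if fitness(candidate) >= fitness(chrom):  # keep only joint improvements
            chrom = candidate
    return chrom


if __name__ == "__main__":
    start = [0] * 8
    result = coordinate(start, toy_fitness)
    print(result, toy_fitness(result))
```

Each worker only ever mutates its own slice, yet every candidate is scored as a full merged chromosome, which is the cooperative part of the scheme. The final acceptance check at the coordinator guards against blocks that improve in isolation but interact badly once combined.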

Paper Structure

This paper contains 43 sections, 18 equations, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: Schematic structure of the MACC-MGNAS algorithm. A global coordinator interacts with multiple modality workers (mworkers) and one fusion worker (fworker). The coordinator decomposes chromosomes into modality-specific and fusion blocks, dispatches them to workers for local optimization, and then collects feedback to update the global population.
  • Figure 2: Illustration of the proposed MACC framework. Each chromosome is decomposed into modality-specific and fusion blocks (Eq. \ref{eq:macc-decomposition}), which are independently optimized by modality workers (mworkers) and a fusion worker (fworker). The elite blocks returned by workers are merged (Eq. \ref{eq:macc-merge}) at the Coordinator side to construct candidate chromosomes, which are evaluated at the global level (Eq. \ref{eq:fitness-global}).
  • Figure 3: Trade-off between efficiency and predictive accuracy across all compared methods. Each point denotes one independent run (10 runs per method). The $x$-axis reports GPU-hours for search and retraining, and the $y$-axis reports the corresponding test F1-score. Marker colors and shapes indicate different algorithms (see legend). MACC-MGNAS (purple crosses) consistently occupies the upper-left region, achieving higher F1 with lower GPU-hours than classical GNAS baselines (GA, PSO, EDA, BO) and multimodal frameworks (DC-NAS, MAGCN, MyGO, C2RS). Note that PSO attains the lowest cost but clusters at lower F1, illustrating a cost–accuracy trade-off. Overall, the distribution highlights the advantage of modality-aware decomposition in reducing cost without sacrificing accuracy.
  • Figure 4: Convergence comparison over 10 runs. MACC-MGNAS rises steeply early, reactivates exploration around Generation 17 through SPDI, and ultimately converges with the highest accuracy and lowest variance. The temporary drops in PSO and EDA curves are due to stochastic exploration generating weaker candidates before selection recovers. Other baselines either plateau early or converge more slowly.
  • Figure 5: Parallel coordinate plot of best architectures across 10 MACC-MGNAS runs. Lines denote architectures, with color encoding F1 scores. Consistent motifs such as multiplicative message operators and normalized fusion recur across trials, while residual diversity is preserved, evidence that SPDI consolidates effective patterns without collapsing exploration.
  • ...and 1 more figure
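
The captions for Figures 4 and 5 credit SPDI with reactivating exploration when the population starts to converge. The exact SPDI formula is not given in this excerpt, so the sketch below only illustrates the general idea under loudly stated assumptions: diversity is taken as one minus the mean pairwise Hamming similarity of chromosomes, and the mutation rate is a linear function of that diversity. Both choices, and the `low`/`high` bounds, are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def hamming_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of gene positions at which two equal-length chromosomes agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def population_diversity(pop: List[Sequence[int]]) -> float:
    """Assumed indicator: 1 - mean pairwise similarity (0 = identical population)."""
    pairs = list(combinations(pop, 2))
    if not pairs:
        return 0.0
    mean_sim = sum(hamming_similarity(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim


def adaptive_mutation_rate(diversity: float,
                           low: float = 0.05, high: float = 0.5) -> float:
    """Assumed schedule: low diversity pushes mutation up (explore),
    high diversity pulls it down (exploit)."""
    return high - (high - low) * diversity


if __name__ == "__main__":
    converged = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
    spread = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
    for pop in (converged, spread):
        d = population_diversity(pop)
        print(d, adaptive_mutation_rate(d))
```

A fully converged population yields zero diversity and therefore the highest mutation rate, which matches the qualitative behavior the captions describe: exploration is switched back on once candidates become too similar.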