SA-GNAS: Seed Architecture Expansion for Efficient Large-scale Graph Neural Architecture Search

Guanghui Zhu, Zipeng Ji, Jingyan Chen, Limin Wang, Chunfeng Yuan, Yihua Huang

TL;DR

This work tackles Graph Neural Architecture Search (GNAS) on large-scale graphs, where conventional NAS methods are computationally prohibitive. It introduces SA-GNAS, a seed-architecture expansion framework with two stages. The first stage selects a seed architecture by performance ranking consistency: among architectures searched on sampled subgraphs, it picks the one from the subgraph that best matches the original graph. The second stage iteratively expands the seed using entropy minimization, guided by node/edge entropy metrics and localized differentiable search. The approach relies on GraphSAINT-based subgraph sampling to keep performance consistent when transferring from subgraphs to the full graph, and it is inherently parallel. On the largest, billion-edge graph, SA-GNAS achieves a 2.8x search speedup over GAUSS. Experiments on five large graphs show that SA-GNAS outperforms hand-crafted GNNs and existing GNAS methods, is compatible with other training tricks, and gives insight into how the architecture evolves during expansion.

Abstract

Graph Neural Architecture Search (GNAS) has demonstrated great effectiveness in automatically designing graph neural architectures for multiple downstream tasks, such as node classification and link prediction. However, most existing GNAS methods cannot efficiently handle large-scale graphs with millions of nodes and edges or more, because of the high computational and memory overhead. To scale GNAS to large graphs while achieving better performance, we propose SA-GNAS, a novel framework based on seed architecture expansion for efficient large-scale GNAS. Similar to cell expansion in biotechnology, we first construct a seed architecture and then expand it iteratively. Specifically, we first propose a performance ranking consistency-based seed architecture selection method, which selects the architecture searched on the subgraph that best matches the original large-scale graph. Then, we propose an entropy minimization-based seed architecture expansion method to further improve the performance of the seed architecture. Extensive experimental results on five large-scale graphs demonstrate that SA-GNAS outperforms human-designed state-of-the-art GNN architectures and existing graph NAS methods. Moreover, SA-GNAS significantly reduces the search time. For the largest graph, with billion-scale edges, SA-GNAS achieves a 2.8x speedup over GAUSS, the state-of-the-art large-scale GNAS method. Additionally, since SA-GNAS is inherently parallel, the search efficiency can be further improved with more GPUs. SA-GNAS is available at https://github.com/PasaLab/SAGNAS.
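
To make the idea of performance ranking consistency concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a small set of probe architectures has been scored both on a reference graph and on each candidate subgraph, and it uses Kendall's tau as a stand-in consistency measure; the measure SA-GNAS actually uses, and how it obtains the reference ranking, are not stated in this summary. The names `kendall_tau`, `select_best_subgraph`, and all scores are hypothetical.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall rank correlation (tau-a) between two equal-length score lists."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


def select_best_subgraph(reference_scores, subgraph_scores):
    """Pick the subgraph whose ranking of probe architectures agrees most
    with the reference ranking. Returns (best_name, all_taus)."""
    taus = {
        name: kendall_tau(reference_scores, scores)
        for name, scores in subgraph_scores.items()
    }
    best = max(taus, key=taus.get)
    return best, taus


# Hypothetical validation accuracies of four probe architectures.
reference = [0.71, 0.68, 0.74, 0.65]
candidates = {
    "subgraph_a": [0.60, 0.58, 0.63, 0.55],  # same ordering as the reference
    "subgraph_b": [0.58, 0.60, 0.55, 0.63],  # fully reversed ordering
    "subgraph_c": [0.60, 0.58, 0.55, 0.63],  # mostly disagrees
}

best, taus = select_best_subgraph(reference, candidates)
print(best, taus)
```

Absolute accuracies on a subgraph are usually lower than on the full graph, which is why the sketch compares orderings rather than raw scores: a subgraph is useful for search if it ranks architectures the same way the full graph would.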

Paper Structure

This paper contains 34 sections, 2 theorems, 14 equations, 10 figures, 11 tables, 2 algorithms.

Key Result

Proposition 1

(GraphSAINT) $\zeta_v^{l+1}$ is an unbiased estimator of the aggregation of $v \in V_s$ in the $(l+1)^{\text{th}}$ GCN layer, i.e., $\mathbb{E}(\zeta_v^{l+1}) = \sum_{u \in V} \widetilde{A}_{u,v}(W^{l})^{T}h_{u}^{l}$, where $W^{l}$ and $h_{u}^{l}$ denote the weight and hidden embedding of the $l^{\text{th}}$ layer.
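
The unbiasedness claim can be checked numerically on a toy case. The sketch below is a simplification under stated assumptions, not GraphSAINT's sampler: each neighbor $u$ is kept independently with a hypothetical probability `p[u]`, features are scalars, the weight matrix $W^{l}$ is dropped (it is linear and does not affect unbiasedness), and the conditioning on $v \in V_s$ is ignored. Each kept term is rescaled by `1 / p[u]`, and the expectation is computed exactly by enumerating every possible sample.

```python
from itertools import product


def full_aggregation(weights, feats):
    """Exact aggregation over all neighbors: sum_u A[u, v] * h[u]."""
    return sum(a * h for a, h in zip(weights, feats))


def sampled_estimate(weights, feats, probs, mask):
    """Aggregation over sampled neighbors only, with each kept term
    divided by its inclusion probability (inverse-probability weighting)."""
    return sum(
        a * h / p
        for a, h, p, keep in zip(weights, feats, probs, mask)
        if keep
    )


def exact_expectation(weights, feats, probs):
    """Expectation of the estimator, enumerating all 2^n inclusion masks."""
    total = 0.0
    for mask in product([False, True], repeat=len(probs)):
        mask_prob = 1.0
        for keep, p in zip(mask, probs):
            mask_prob *= p if keep else 1.0 - p
        total += mask_prob * sampled_estimate(weights, feats, probs, mask)
    return total


# Hypothetical normalized adjacency column, features, and inclusion probabilities.
adj = [0.5, 0.25, 0.125, 0.125]
h = [2.0, -1.0, 4.0, 8.0]
p = [0.9, 0.5, 0.25, 0.1]

target = full_aggregation(adj, h)
expected = exact_expectation(adj, h, p)
print(target, expected)
```

Without the `1 / p` rescaling, rarely sampled neighbors would be systematically underweighted and the expectation would no longer match the full aggregation.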

Figures (10)

  • Figure 1: Left: the cell-based search space design. Right: the cell architecture. Different edge colors represent various candidate aggregation operations. The search objective is to identify the optimal aggregation operation on each edge.
  • Figure 2: The overall search framework of SA-GNAS, which consists of two stages: a) performance ranking consistency-based seed architecture selection; b) entropy minimization-based seed architecture expansion.
  • Figure 3: Performance ranking consistency-based subgraph matching.
  • Figure 4: The architecture expansion based on entropy minimization and localized differentiable architecture search.
  • Figure 5: Trajectories of the overall entropy of the cell architecture during the architecture expansion stage.
  • ...and 5 more figures
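
As a rough illustration of what an "overall entropy of the cell architecture" (Figure 5) could measure, the sketch below treats each edge of a cell as holding one logit per candidate aggregation operation, converts the logits to a softmax distribution, and sums the Shannon entropies over edges. This is an assumption made for illustration; the paper's exact node and edge entropy definitions are not given in this summary, and `edge_entropy`, `cell_entropy`, and the example cells are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def edge_entropy(logits):
    """Shannon entropy (in nats) of the operation distribution on one edge."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)


def cell_entropy(cell):
    """Sum of edge entropies over a cell given as {edge: logits}."""
    return sum(edge_entropy(logits) for logits in cell.values())


# Two edges, four candidate operations each (hypothetical logits).
undecided = {
    ("n0", "n1"): [0.0, 0.0, 0.0, 0.0],  # uniform: maximal uncertainty
    ("n1", "n2"): [0.0, 0.0, 0.0, 0.0],
}
decided = {
    ("n0", "n1"): [6.0, 0.0, 0.0, 0.0],  # one operation clearly preferred
    ("n1", "n2"): [0.0, 0.0, 6.0, 0.0],
}

print(cell_entropy(undecided), cell_entropy(decided))
```

Under this reading, a falling entropy trajectory means the search is committing to one operation per edge, which is the sense in which an expansion step could be said to minimize entropy.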

Theorems & Definitions (4)

  • Proposition 1
  • Definition 1
  • Definition 2
  • Proposition 2