FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening

Shubhajit Roy, Hrriday Ruparel, Kishan Ved, Anirban Dasgupta

TL;DR

This work tackles the bottleneck of GNN inference on large graphs by using graph coarsening to split the input into subgraphs and augmenting them with Extra Nodes or Cluster Nodes to mitigate boundary information loss. The method, FIT-GNN, enables training and inference on subgraphs, yielding up to 100× faster single-node inference and substantial memory savings while preserving competitive accuracy across node- and graph-level tasks on diverse benchmarks. The authors provide a theoretical time/space complexity framework and validate the approach with extensive experiments on 13 real-world datasets, showing practical scalability where traditional full-graph inference is infeasible. Overall, FIT-GNN offers a scalable, memory-efficient pathway for deploying GNNs on large-scale graphs with minimal performance degradation.

Abstract

Scalability of Graph Neural Networks (GNNs) remains a significant challenge. To tackle this, methods like coarsening, condensation, and computation trees are used to train on a smaller graph, resulting in faster computation. Nonetheless, prior research has not adequately addressed the computational costs during the inference phase. This paper presents a novel approach to improve the scalability of GNNs by reducing computational burden during the inference phase using graph coarsening. We demonstrate two different methods -- Extra Nodes and Cluster Nodes. Our study extends the application of graph coarsening for graph-level tasks, including graph classification and graph regression. We conduct extensive experiments on multiple benchmark datasets to evaluate the performance of our approach. Our results show that the proposed method achieves orders of magnitude improvements in single-node inference time compared to traditional approaches. Furthermore, it significantly reduces memory consumption for node and graph classification and regression tasks, enabling efficient training and inference on low-resource devices where conventional methods are impractical. Notably, these computational advantages are achieved while maintaining competitive performance relative to baseline models.

Paper Structure

This paper contains 23 sections, 3 theorems, 12 equations, 6 figures, 12 tables, 5 algorithms.

Key Result

Lemma 4.1

A model with a single GNN layer cannot distinguish between $G$ and $\mathcal{G}_s$ when the Extra Nodes method is used. (Proof in the appendix.)
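The intuition behind the lemma can be checked numerically. The sketch below is not the authors' implementation: it assumes that the Extra Nodes method appends to each cluster the one-hop neighbours that fall outside it, and it uses an unnormalised sum-aggregation layer with self-loops as a stand-in for a generic GNN layer. The helper names (`extra_node_subgraph`, `one_layer`) and the toy path graph are hypothetical.

```python
import numpy as np

def extra_node_subgraph(adj, core):
    """Induced subgraph on a cluster plus its out-of-cluster one-hop neighbours.

    Assumption: this is how "Extra Nodes" are chosen; the source does not spell it out.
    """
    core = np.asarray(core)
    # Every node adjacent to at least one core node.
    touched = np.flatnonzero(adj[core].sum(axis=0) > 0)
    extra = np.setdiff1d(touched, core)
    nodes = np.concatenate([core, extra])  # core nodes come first
    return nodes, adj[np.ix_(nodes, nodes)]

def one_layer(adj, x, w):
    """One message-passing layer: sum over neighbours and self, linear map, ReLU."""
    return np.maximum((adj + np.eye(adj.shape[0])) @ x @ w, 0.0)

rng = np.random.default_rng(0)

# Toy graph: a path 0-1-2-3-4-5 split into two clusters.
n = 6
adj = np.zeros((n, n))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[u, v] = adj[v, u] = 1.0
clusters = [[0, 1, 2], [3, 4, 5]]

x = rng.normal(size=(n, 4))   # node features
w = rng.normal(size=(4, 3))   # shared layer weights

full_out = one_layer(adj, x, w)

max_gap = 0.0
for core in clusters:
    nodes, sub_adj = extra_node_subgraph(adj, core)
    sub_out = one_layer(sub_adj, x[nodes], w)
    # Compare only the core rows; extra nodes exist just to feed messages in.
    gap = np.abs(sub_out[: len(core)] - full_out[core]).max()
    max_gap = max(max_gap, gap)

print("largest difference on core nodes:", max_gap)
```

Under these assumptions every core node sees its complete one-hop neighbourhood inside its subgraph, so the single-layer outputs agree with full-graph inference up to floating-point error. With two or more layers the extra nodes' own neighbourhoods are truncated, so the same check would generally fail.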

Figures (6)

  • Figure 1: Overall pipeline of the proposed method, compared with traditional GNN training and inference. The pipeline shown is for node-level tasks.
  • Figure 2: Comparison of the Extra Node and Cluster Node methods of appending additional nodes to the subgraphs $G_1, G_2, G_3$.
  • Figure 3: GPU memory consumption during inference, in megabytes (MB, log scale), for FIT-GNN (Cluster Node) at different reduction ratios $r$ and for the baseline. The variation_neighborhoods coarsening algorithm is used.
  • Figure 4: Feasibility of the inference methods on different node-level datasets, plotted against coarsening ratio. For each dataset, two inference setups are compared against the baseline (classical GNN) computational cost of $\mathcal{O}\left(n^2d + nd^2\right)$: single-node inference, with computational cost $\mathcal{O}\left(\max_{i}(\bar{n}_i^2d + \bar{n}_id^2) + n\right)$, and full-graph inference, with computational cost $\mathcal{O}\left(\sum_{i=1}^{k}(\bar{n}_i^2d + \bar{n}_id^2)\right)$. The coarsening method used is variation_neighborhoods. Both axes are in logarithmic scale (base 10).
  • Figure 5: Ablation study on the Cora dataset comparing experimental setups. The plot also compares the different methods of appending nodes to the subgraphs and shows how performance changes with the coarsening ratio. The variation_neighborhoods coarsening algorithm is used.
  • ...and 1 more figure
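
The three cost expressions in the Figure 4 caption can be compared directly for a given partition. The following is a rough sketch that drops the constants hidden by the $\mathcal{O}$ notation; it assumes $\bar{n}_i$ is the number of nodes in the $i$-th subgraph after nodes are appended, $d$ is the feature dimension, and $k$ is the number of subgraphs. The function names and the example sizes are hypothetical and are not taken from the paper's experiments.

```python
def gnn_cost(num_nodes: int, d: int) -> int:
    """Dense GNN cost term n^2 * d + n * d^2 (constants dropped)."""
    return num_nodes**2 * d + num_nodes * d**2

def baseline_cost(n: int, d: int) -> int:
    """Classical full-graph inference on all n nodes."""
    return gnn_cost(n, d)

def full_graph_cost(subgraph_sizes: list[int], d: int) -> int:
    """Inference on every subgraph: sum over i of the per-subgraph cost."""
    return sum(gnn_cost(size, d) for size in subgraph_sizes)

def single_node_cost(subgraph_sizes: list[int], d: int, n: int) -> int:
    """Worst-case single-node inference: the largest subgraph, plus the
    additive n term that appears in the caption's expression."""
    return max(gnn_cost(size, d) for size in subgraph_sizes) + n

# Hypothetical setting: 10,000 nodes, 64 features, 100 subgraphs,
# each with 100 core nodes and 20 appended nodes.
n, d = 10_000, 64
sizes = [120] * 100

base = baseline_cost(n, d)
full = full_graph_cost(sizes, d)
single = single_node_cost(sizes, d, n)

print(f"baseline:    {base:,}")
print(f"full-graph:  {full:,}  ({base / full:.1f}x smaller)")
print(f"single-node: {single:,}  ({base / single:.1f}x smaller)")
```

Because the baseline term is quadratic in $n$ while the subgraph terms are quadratic only in $\bar{n}_i$, the gap widens as subgraphs shrink, provided the appended nodes do not inflate $\bar{n}_i$ too much.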

Theorems & Definitions (6)

  • Lemma 4.1
  • Lemma 4.2
  • proof
  • Corollary 4.3
  • proof
  • proof