GNNMerge: Merging of GNN Models Without Accessing Training Data

Vipul Garg, Ishita Thakre, Sayan Ranu

TL;DR

GNNMerge introduces a data-free model-merging approach for Graph Neural Networks (GNNs) that aligns node embeddings rather than merging parameters directly. By relaxing the joint, layer-coupled objective to a layer-wise independent formulation, it derives an analytical solution for common message-passing neural network (MPNN) architectures, enabling fast, scalable merging. Empirical results across multiple datasets, architectures, and tasks show accuracy gains of up to 24% over existing baselines and speedups of up to about 136× compared to training from scratch. This work offers a practical path to continual, multi-task, and privacy-preserving graph learning where training data cannot be shared or reused.

Abstract

Model merging has gained prominence in machine learning as a method to integrate multiple trained models into a single model without accessing the original training data. While existing approaches have demonstrated success in domains such as computer vision and NLP, their application to Graph Neural Networks (GNNs) remains unexplored. These methods often rely on the assumption of shared initialization, which is seldom applicable to GNNs. In this work, we undertake the first benchmarking study of model merging algorithms for GNNs, revealing their limited effectiveness in this context. To address these challenges, we propose GNNMerge, which utilizes a task-agnostic node embedding alignment strategy to merge GNNs. Furthermore, we establish that under a mild relaxation, the proposed optimization objective admits direct analytical solutions for widely used GNN architectures, significantly enhancing its computational efficiency. Empirical evaluations across diverse datasets, tasks, and architectures establish GNNMerge to be up to 24% more accurate than existing methods while delivering over 2 orders of magnitude speed-up compared to training from scratch.
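The layer-wise relaxation described above can be illustrated with a small least-squares sketch. This is not the paper's derivation: it assumes a single linear transform per layer, assumes each base model's layer inputs (for a GNN, its aggregated neighbor embeddings on the merging graph) are available as dense matrices, and adds a small ridge term for numerical stability. The function name `merge_linear_layer` and all variable names are hypothetical.

```python
import numpy as np

def merge_linear_layer(inputs, weights, ridge=1e-6):
    """Closed-form merge of one linear layer (illustrative assumption).

    Minimizes sum_i ||H_i @ W - H_i @ W_i||_F^2 + ridge * ||W||_F^2 over W,
    where H_i is the layer input of base model i and W_i its weight matrix.
    Setting the gradient to zero gives the normal equations
        (sum_i H_i^T H_i + ridge * I) W = sum_i H_i^T H_i W_i.
    """
    d_in = weights[0].shape[0]
    gram_sum = ridge * np.eye(d_in)                 # left-hand side accumulator
    target_sum = np.zeros(weights[0].shape, float)  # right-hand side accumulator
    for H, W in zip(inputs, weights):
        gram = H.T @ H
        gram_sum += gram
        target_sum += gram @ W
    # Solve the linear system instead of forming an explicit inverse.
    return np.linalg.solve(gram_sum, target_sum)

def alignment_loss(W, inputs, weights):
    """Total squared distance between merged and original layer outputs."""
    return sum(np.linalg.norm(H @ W - H @ Wi) ** 2
               for H, Wi in zip(inputs, weights))

# Toy data standing in for two base models (random, for illustration only).
rng = np.random.default_rng(0)
H1, H2 = rng.normal(size=(50, 8)), rng.normal(size=(50, 8))
W1, W2 = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))

W_merged = merge_linear_layer([H1, H2], [W1, W2])
W_avg = (W1 + W2) / 2  # plain weight averaging, for comparison

loss_merged = alignment_loss(W_merged, [H1, H2], [W1, W2])
loss_avg = alignment_loss(W_avg, [H1, H2], [W1, W2])
```

Because each layer is solved independently and in closed form, no gradient-based training is needed in this sketch; on the toy data, `loss_merged` should not exceed `loss_avg` beyond the effect of the small ridge term.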

Paper Structure

This paper contains 36 sections, 28 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: A visual depiction of the alignment objective in GNNMerge. The yellow and orange ellipses represent the regions where the highlighted nodes receive the correct prediction. GNNMerge aims to embed the nodes close to their original embeddings, increasing the likelihood that the new embeddings fall within the ellipses. As stated in the merging problem formulation, the merging graph(s) need not be the training graph or rely on supervision labels. While we assume a common graph for aligning base models, task-specific graphs can be used if needed.
  • Figure 2: Variation in performance of the two objective functions as the number of GCN layers is changed for the arxiv dataset.
  • Figure 3: Visual illustration of embedding alignment using GNNMerge and WAvg.
  • Figure 4: Variation of average accuracy of merging methods as the number of models varies.
  • Figure 5: Visual illustration of embedding alignment using GNNMerge and WAvg as the merging methods.
  • ...and 4 more figures

Theorems & Definitions (2)

  • Definition 1: Graph
  • Definition 2: Learning a task