Gradient Rewiring for Editable Graph Neural Network Training

Zhimeng Jiang, Zirui Liu, Xiaotian Han, Qizhang Feng, Hongye Jin, Qiaoyu Tan, Kaixiong Zhou, Na Zou, Xia Hu

TL;DR

GRE is a simple yet effective gradient rewiring method for editable graph neural network training. It first stores the anchor gradient of the loss on the training nodes to preserve locality, then uses that anchor gradient to rewire the gradient of the loss on the target node so that performance on the training nodes is preserved.

Abstract

Deep neural networks are ubiquitously adopted in many applications, such as computer vision, natural language processing, and graph analytics. However, well-trained neural networks can make prediction errors after deployment as the world changes. Model editing involves updating the base model to correct prediction errors when training data and computational resources are less accessible. Despite recent advances in model editors for computer vision and natural language processing, editable training for graph neural networks (GNNs) is rarely explored. The challenge with editable GNN training lies in the inherent information aggregation across neighbors, which can lead model editors to unintentionally affect the predictions of other nodes. In this paper, we first observe that the gradients of the cross-entropy loss on the target node and on the training nodes are significantly inconsistent, which indicates that directly fine-tuning the base model using the loss on the target node degrades performance on the training nodes. Motivated by this gradient inconsistency, we propose a simple yet effective Gradient Rewiring method for Editable graph neural network training, named GRE. Specifically, we first store the anchor gradient of the loss on the training nodes to preserve locality. Subsequently, we rewire the gradient of the loss on the target node using the anchor gradient so that performance on the training nodes is preserved. Experiments demonstrate the effectiveness of GRE across various model architectures and graph datasets in multiple editing situations. The source code is available at https://github.com/zhimengj0326/Gradient_rewiring_editing
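The two steps described in the abstract (store an anchor gradient, then rewire the target gradient) can be illustrated with a small sketch. The abstract does not state the exact rewiring rule, so the code below substitutes a standard gradient projection: when the target gradient has a negative inner product with the anchor gradient, the conflicting component is removed. The function names `rmse`, `dot`, and `rewire`, the projection rule itself, and the toy two-dimensional gradients are all assumptions made for illustration, not the authors' implementation.

```python
import math

def dot(u, v):
    """Inner product of two flattened gradient vectors."""
    return sum(a * b for a, b in zip(u, v))

def rmse(u, v):
    """Root-mean-square distance between two gradients (an inconsistency measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

def rewire(target_grad, anchor_grad):
    """Hypothetical rewiring rule (an assumption, not the paper's formula).

    If a descent step along target_grad would increase the training loss to
    first order (negative inner product with anchor_grad), subtract the
    component of target_grad that lies along anchor_grad.
    """
    inner = dot(target_grad, anchor_grad)
    norm_sq = dot(anchor_grad, anchor_grad)
    if inner >= 0 or norm_sq == 0:
        return list(target_grad)  # no conflict, keep the target gradient
    scale = inner / norm_sq
    return [t - scale * a for t, a in zip(target_grad, anchor_grad)]

# Toy gradients: the anchor is computed once on training nodes and stored;
# the target gradient comes from the node whose prediction is being edited.
anchor = [1.0, 0.0]
target = [-1.0, 1.0]

rewired = rewire(target, anchor)
print("RMSE between gradients:", rmse(anchor, target))
print("inner product before:", dot(target, anchor))
print("inner product after:", dot(rewired, anchor))
```

In this toy case the rewired gradient still has a positive inner product with the original target gradient, so a descent step continues to reduce the target loss while no longer increasing the training loss to first order.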

Paper Structure

This paper contains 38 sections, 10 equations, 7 figures, 8 tables, 2 algorithms.

Figures (7)

  • Figure 1: (a) Top: RMSE distance between the gradients of cross-entropy loss over training datasets and over the targeted sample for different architectures. (b) Middle: Cross-entropy loss over training datasets when the model is updated using target loss. (c) Bottom: Cross-entropy loss over the targeted sample when the model is updated using target loss.
  • Figure 2: The test accuracy dropdown in the sequential editing setting for GCN and GraphSAGE on various datasets. The y-axis is in percent ($\%$).
  • Figure 3: The trade-off between success rate and test accuracy dropdown in the independent editing setting for GCN and GraphSAGE on various datasets. A trade-off curve closer to the top-left corner indicates better trade-off performance. The x- and y-axes are in percent ($\%$).
  • Figure 4: Hyperparameter study of the test accuracy dropdown in the independent editing setting with respect to $\lambda$.
  • Figure 5: The trade-off between success rate and test accuracy dropdown in the independent editing setting for EGNN-GCN and EGNN-SAGE on various datasets. A trade-off curve closer to the top-left corner indicates better trade-off performance. The x- and y-axes are in percent ($\%$).
  • ...and 2 more figures