Fast Inference of Removal-Based Node Influence

Weikai Li, Zhiping Xiao, Xiao Luo, Yizhou Sun

TL;DR

The paper measures task-specific node influence under node removal. A trained GNN $g_\theta$ serves as a surrogate, and the influence of removing node $v_r$ is defined as the total variation distance between the predictions on the original graph $G$ and on the graph $G_{-v_r}$ with $v_r$ removed: $F_{g_\theta}(v_r) = \sum_{i \neq r} \lVert g_\theta(G)_i - g_\theta(G_{-v_r})_i \rVert_1$. The paper introduces NORA, a gradient-based approximation that decomposes the influence into three components (embedding disappearance, aggregation change, and multi-hop spread) and computes a unified estimate for all nodes with one forward and one backward pass, in $O(LNh^2+LMh)$ time. Experiments on six datasets and six GNN models show that NORA consistently outperforms adapted baselines (node-mask and prediction-based) in correlating with real influence while drastically reducing runtime; case studies on large graphs such as ogbn-arxiv support its practical relevance. The approach is scalable and model-agnostic, with applications in marketing, information diffusion, and network robustness, and it leaves room for improved approximations and broader perturbation scenarios. The influence definition $F_{g_\theta}(v_r)$, the complexity comparison, and the three-term decomposition are central to the method's effectiveness and efficiency.
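To make the definition of $F_{g_\theta}(v_r)$ concrete, here is a minimal sketch of the brute-force computation on a toy graph. It is not the authors' code: the two-layer GCN, its random (untrained) weights, the symmetric normalization, and the names `gcn_predict` and `removal_influence` are all assumptions chosen for illustration. Only the formula itself comes from the text above.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric GCN normalization D^{-1/2} (A + I) D^{-1/2} (an assumed choice)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_predict(adj, x, w1, w2):
    """Hypothetical two-layer GCN; returns one class distribution per node."""
    a_norm = normalized_adjacency(adj)
    hidden = np.maximum(a_norm @ x @ w1, 0.0)      # ReLU
    logits = a_norm @ hidden @ w2
    logits -= logits.max(axis=1, keepdims=True)    # numerically stable softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def removal_influence(adj, x, w1, w2, r):
    """F(v_r): summed L1 change in the other nodes' predictions after removing v_r."""
    keep = np.arange(adj.shape[0]) != r
    before = gcn_predict(adj, x, w1, w2)[keep]                     # g(G)_i, i != r
    after = gcn_predict(adj[np.ix_(keep, keep)], x[keep], w1, w2)  # g(G_{-v_r})_i
    return float(np.abs(before - after).sum())

# Toy graph: a path 0-1-2-3 plus an isolated node 4.
rng = np.random.default_rng(0)
adj = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0
x = rng.normal(size=(5, 3))    # node features
w1 = rng.normal(size=(3, 4))   # stand-in for trained weights
w2 = rng.normal(size=(4, 2))

# One full model evaluation on the modified graph per removed node.
scores = [removal_influence(adj, x, w1, w2, r) for r in range(5)]
```

In this sketch the isolated node 4 gets an influence of zero, because removing it leaves every other node's neighborhood and normalization unchanged. The loop over `r` is the part that becomes expensive on large graphs, since each score needs its own pass over the modified graph.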

Abstract

Graph neural networks (GNNs) are widely utilized to capture the information spreading patterns in graphs. While remarkable performance has been achieved, there is a new trending topic of evaluating node influence. We propose a new method of evaluating node influence, which measures the prediction change of a trained GNN model caused by removing a node. A real-world application is, "In the task of predicting Twitter accounts' polarity, had a particular account been removed, how would others' polarity change?". We use the GNN as a surrogate model whose prediction could simulate the change of nodes or edges caused by node removal. Our target is to obtain the influence score for every node, and a straightforward way is to alternately remove every node and apply the trained GNN on the modified graph to generate new predictions. It is reliable but time-consuming, so we need an efficient method. The related lines of work, such as graph adversarial attack and counterfactual explanation, cannot directly satisfy our needs, since their problem settings are different. We propose an efficient, intuitive, and effective method, NOde-Removal-based fAst GNN inference (NORA), which uses the gradient information to approximate the node-removal influence. It only costs one forward propagation and one backpropagation to approximate the influence score for all nodes. Extensive experiments on six datasets and six GNN models verify the effectiveness of NORA. Our code is available at https://github.com/weikai-li/NORA.git.
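A rough cost comparison shows why the remove-and-rerun approach is called time-consuming. The symbol meanings here are assumptions, since this page does not define them: $L$ is the number of GNN layers, $N$ the number of nodes, $M$ the number of edges, and $h$ the hidden dimension. It is also assumed that a single forward pass of a message-passing GNN costs $O(LNh^2 + LMh)$, matching the complexity quoted for NORA. Under those assumptions, scoring every node by removal needs $N$ forward passes:

$$
\underbrace{N}_{\text{removals}} \cdot \underbrace{O(LNh^2 + LMh)}_{\text{one forward pass}} = O(LN^2h^2 + LNMh)
\qquad \text{versus} \qquad
\underbrace{O(LNh^2 + LMh)}_{\text{one forward and one backward pass}}
$$

So the approximation is cheaper by roughly a factor of $N$, which is what makes it practical on graphs the size of ogbn-arxiv.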

Paper Structure

This paper contains 26 sections, 5 theorems, 19 equations, 5 figures, and 9 tables.

Key Result

Lemma 1

If removing $v_r$ consistently changes the class distributions of other nodes, its influence $F_{g_\theta}(v_r)$, as defined in the node-influence equation, is equal to:

Figures (5)

  • Figure 1: An example of the task-specific influence of node removal in social networks. Red and blue represent two different opinions, and color shades represent the strength of the opinion. When the top blue node is removed, the two pink nodes might hear less from the blue nodes and become red. The two left nodes might no longer follow the middle node, and the left white node might become blue. Together, these changes constitute the influence of removing the top blue node.
  • Figure 2: Our schema of calculating node influence. The GNN model is trained on the original graph. We remove a node and apply the trained GNN to the modified graph. We calculate the total variation distance between the original predictions and new predictions as the influence of node removal.
  • Figure 3: Three kinds of influence of node removal: the disappearance of its node embedding; the change of its nearby nodes' aggregation terms; and the spread-out influence to multi-hop neighbors.
  • Figure 4: Relationship between node influence and degree.
  • Figure 5: The relationship between node influence and edge type. "Ratio" means the mean node degree in a certain influence level divided by the total number of edges. Here, the degree and number of edges are separately computed for each edge type.

Theorems & Definitions (5)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5