Robust Subgraph Learning by Monitoring Early Training Representations

Sepideh Neshatfar, Salimeh Yasaei Sekeh

TL;DR

This paper proposes SHERD (Subgraph Learning Hale through Early Training Representation Distances), a novel approach for graph inputs that substantially improves adversarial robustness while also improving overall performance.

Abstract

Graph neural networks (GNNs) have attracted significant attention for their strong performance in graph learning and node classification tasks. However, their vulnerability to adversarial attacks, particularly through susceptible nodes, poses a challenge in decision-making. Because attacks propagate throughout the entire graph, robust graph summarization is needed. In this paper, we address both performance and adversarial robustness for graph inputs by introducing SHERD (Subgraph Learning Hale through Early Training Representation Distances). SHERD leverages information from the layers of a partially trained graph convolutional network (GCN) to detect nodes that are susceptible to adversarial attacks, using standard distance metrics. The method identifies these "vulnerable (bad)" nodes and removes them to form a robust subgraph while maintaining node classification performance. In our experiments, we evaluate SHERD's robustness by comparing the network's performance on original and subgraph inputs against various baselines under existing adversarial attacks. Experiments across multiple datasets, including the citation datasets Cora, Citeseer, and Pubmed as well as cell graphs of microanatomical tissue structures in the placenta, show that SHERD not only achieves substantial improvement in robust performance but also outperforms several baselines in node classification accuracy and computational complexity.

Paper Structure

This paper contains 19 sections, 11 equations, 10 figures, 2 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Subphase 2: A visualization of the process of obtaining the robustness score, $\mathcal{S}_{Robust}$, of node clusters. The input, which includes an adversarial node cluster, is given to the GCN for early representation retrieval. Each $\mathcal{S}^i_{Robust}$ is computed from the susceptibility score as $\mathcal{S}^i_{Robust} = -\mathcal{S}^i_{Suscep}$.
  • Figure 2: Subphase 3: A visualization of the process of obtaining the performance score, $\mathcal{S}_{Perform}$, of node clusters. The input, excluding a node cluster, is fed to the GCN for early representation retrieval. The distance between this representation and the corresponding one for the original input is $\mathcal{S}_{Perform}$ for this specific cluster.
  • Figure 3: Overview of the entire methodology, encompassing the partial training of the GCN in Phase I and the subsequent compression process in Phase II, which includes node clustering (Subphase 1), robustness analysis (Subphase 2), and performance analysis (Subphase 3).
  • Figure 4: Distance heatmaps for Cora (first), Citeseer (second), Mini-Pubmed (third), and Mini-Placenta (fourth).
  • Figure 5: The relationship between three hyperparameters: $\tau$ (training epochs before Phase II), $B$ (number of clusters in balanced K-means in Phase II, Subphase 1), and $C$ (data compression percentage). The color bar indicates the extent to which each setup outperforms the original accuracy; negative values indicate that the Original baseline outperforms ours.
  • ...and 5 more figures
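
The two cluster scores described in the captions of Figures 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: a single random-weight GCN layer stands in for the partially trained GCN, Gaussian feature noise stands in for a real adversarial attack, the Frobenius norm is used as the "standard distance metric", and the performance score compares only the rows of the nodes that remain after a cluster is excluded. The function names are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric GCN normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def early_representation(adj, feats, weight):
    """One GCN layer with ReLU (assumed stand-in for an early-training layer)."""
    return np.maximum(normalized_adjacency(adj) @ feats @ weight, 0.0)

def robustness_score(adj, feats, weight, cluster, rng, noise=0.5):
    """S_Robust = -S_Suscep, where S_Suscep is the representation shift
    caused by perturbing one cluster (Gaussian noise is an assumption)."""
    perturbed = feats.copy()
    perturbed[cluster] += noise * rng.standard_normal((len(cluster), feats.shape[1]))
    clean = early_representation(adj, feats, weight)
    attacked = early_representation(adj, perturbed, weight)
    susceptibility = np.linalg.norm(clean - attacked)  # Frobenius distance
    return -susceptibility

def performance_score(adj, feats, weight, cluster):
    """Distance between representations with and without the cluster,
    measured on the nodes that remain (an assumption of this sketch)."""
    keep = np.setdiff1d(np.arange(adj.shape[0]), cluster)
    full = early_representation(adj, feats, weight)[keep]
    reduced = early_representation(adj[np.ix_(keep, keep)], feats[keep], weight)
    return np.linalg.norm(full - reduced)

# Toy graph: 6 nodes, two triangles joined by one edge, split into 2 clusters.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

rng = np.random.default_rng(0)
feats = rng.standard_normal((6, 4))   # node features
weight = rng.standard_normal((4, 3))  # placeholder for partially trained weights
clusters = [np.array([0, 1, 2]), np.array([3, 4, 5])]

for i, cluster in enumerate(clusters):
    s_robust = robustness_score(adj, feats, weight, cluster, rng)
    s_perform = performance_score(adj, feats, weight, cluster)
    print(f"cluster {i}: S_Robust={s_robust:.3f}, S_Perform={s_perform:.3f}")
```

In this sketch, a strongly negative robustness score means that perturbing the cluster shifts the early representation a lot. How the two scores are combined to choose which clusters to remove is described in the paper itself and is not reproduced here.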