GSTAM: Efficient Graph Distillation with Structural Attention-Matching

Arash Rasti-Meymandi, Ahmad Sajedi, Zhaopan Xu, Konstantinos N. Plataniotis

TL;DR

GSTAM tackles graph dataset condensation for graph classification by distilling structural information via structural attention matching. It leverages per-layer GNN attention maps to guide synthetic graph generation, avoiding bi-level optimization and improving efficiency. The method introduces STAM with a mapping from layer features to attention tensors, alongside L_STAM and L_reg losses, plus learnable adjacency logits to shape synthetic graphs, and demonstrates superior performance and cross-architecture generalization across benchmarks. This attention-driven distillation offers practical impact for scalable, accurate graph classification on large datasets and under extreme condensation ratios.

Abstract

Graph distillation has emerged as a solution for reducing large graph datasets to smaller, more manageable, and informative ones. Existing methods primarily target node classification, involve computationally intensive processes, and fail to capture the true distribution of the full graph dataset. To address these issues, we introduce Graph Distillation with Structural Attention Matching (GSTAM), a novel method for condensing graph classification datasets. GSTAM leverages the attention maps of GNNs to distill structural information from the original dataset into synthetic graphs. The structural attention-matching mechanism exploits the areas of the input graph that GNNs prioritize for classification, effectively distilling such information into the synthetic graphs and improving overall distillation performance. Comprehensive experiments demonstrate GSTAM's superiority over existing methods, achieving 0.45% to 6.5% better performance in extreme condensation ratios, highlighting its potential use in advancing distillation for graph classification tasks (Code available at https://github.com/arashrasti96/GSTAM).

Paper Structure

This paper contains 22 sections, 4 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: Motivating example. (a) Attention maps at different levels of a network trained for face recognition, indicating where the network focuses on a particular input image [zagoruyko2016paying]. The brighter the color, the greater the network's focus on that part of the image. (b) An input graph and its corresponding structural attention map, created using a technique similar to Grad-CAM [selvaraju2017grad], which can reveal where different layers of a trained GNN focus when classifying the given graph. This information is valuable when distilling a graph dataset, as it highlights the areas of the input graph that the GNN prioritizes for classification. A darker red color represents higher attention.
  • Figure 2: Overview of GSTAM. GSTAM matches the structural attention maps of different layers of a GNN model trained on the full and the synthetic graph datasets, respectively, along with the L_reg loss, which accounts for the final layer of the GNN model.
  • Figure 3: The effect of different parameters on the ogbg-molbbbp and ogbg-molhiv datasets; (a,b) illustrate the impact of the trade-off parameter $\lambda$ used in \ref{eq:loss}, while (c,d) demonstrate the effect of the parameter $p$ used in \ref{eq:attention}.
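
The roles of $p$ and $\lambda$ in Figure 3 can be illustrated with a small sketch. This is not the paper's implementation: the equations themselves are not reproduced in this section, so the sketch assumes an attention-transfer-style map (per-node sum of $|h|^p$ over feature channels, L2-normalized) and an additive objective of the form L_STAM + $\lambda$ · L_reg. The function names `attention_map` and `combined_loss` are hypothetical.

```python
def attention_map(node_features, p=2.0):
    """Hypothetical per-node attention for one GNN layer.

    Assumption: attention of a node is the sum over feature channels of
    |h|^p, and the resulting vector is L2-normalized across nodes.
    """
    raw = [sum(abs(h) ** p for h in row) for row in node_features]
    norm = sum(a * a for a in raw) ** 0.5
    if norm == 0.0:
        # All-zero features: return an all-zero map instead of dividing by zero.
        return [0.0 for _ in raw]
    return [a / norm for a in raw]


def combined_loss(l_stam, l_reg, lam):
    """Assumed additive objective: attention-matching term plus lam times the reg term."""
    return l_stam + lam * l_reg


# Toy layer output: 3 nodes, 2 feature channels each (made-up numbers).
features = [[0.1, -0.2], [1.0, 2.0], [0.5, 0.5]]

soft = attention_map(features, p=1.0)   # smaller p: attention spread more evenly
sharp = attention_map(features, p=4.0)  # larger p: attention concentrates on the strongest node

total = combined_loss(l_stam=1.0, l_reg=2.0, lam=0.5)
```

Under these assumptions, raising $p$ concentrates the map on the nodes with the largest activations, while $\lambda$ sets how much the final-layer term contributes relative to the attention-matching term.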