SlotGAT: Slot-based Message Passing for Heterogeneous Graph Neural Network

Ziang Zhou, Jieming Shi, Renchi Yang, Yuanhang Zou, Qing Li

TL;DR

SlotGAT addresses semantic mixing in heterogeneous graphs by introducing per-type slots and slot-wise message passing, enabling each node-type feature space to evolve separately. It combines slot-specific transformations, edge-type aware slot attention, and a downstream slot attention module to fuse slot information for tasks such as node classification and link prediction. Theoretical analysis shows slot representations converge to type-specific component averages in connected components, supporting the separation of semantics across types. Empirical results across six datasets and 13 baselines demonstrate state-of-the-art performance, with ablations and visualizations confirming the effectiveness of the slot-based design and slot attention. The approach offers a principled method to preserve heterogeneous semantics, with future work aimed at improving scalability with many node types.

Abstract

Heterogeneous graphs are ubiquitous for modeling complex data, and there is an urgent need for powerful heterogeneous graph neural networks to support important applications. We identify a potential semantic mixing issue in existing message passing processes, where the representations of the neighbors of a node $v$ are forced to be transformed to the feature space of $v$ for aggregation, even though the neighbors are of different types. That is, the semantics of different node types are entangled in node $v$'s representation. To address the issue, we propose SlotGAT, which runs separate message passing processes in slots, one for each node type, to maintain the representations in their own node-type feature spaces. Moreover, in a slot-based message passing layer, we design an attention mechanism for effective slot-wise message aggregation. Further, we develop a slot attention technique after the last layer of SlotGAT to learn the importance of different slots in downstream tasks. Our analysis indicates that the slots in SlotGAT can preserve different semantics in various feature spaces. The superiority of SlotGAT is demonstrated against 13 baselines on 6 datasets for node classification and link prediction. Our code is at https://github.com/scottjiao/SlotGAT_ICML23/.

Paper Structure

This paper contains 23 sections, 1 theorem, 22 equations, 6 figures, 14 tables, 1 algorithm.

Key Result

Theorem 4.1

Given a heterogeneous graph $\mathcal{G}$, for the $t$-slot feature matrix $\mathbf{X}^{(t)}\in \mathbb{R}^{n\times d^t_0}$ in which only nodes of type $t$ have non-zero features, if the graph has no bipartite components, after infinite number of convolution operations $G$, the slot $t$ representat

Figures (6)

  • Figure 1: (a) A heterogeneous graph. (b) The slot-based message passing in the proposed SlotGAT: every node has 3 slots, corresponding to 3 node-type feature spaces. For example, in the input layer, $v_2$'s slot 1 is initialized by its features since $v_2$ is of type 1, while slots 0 and 2 of $v_2$ are empty. The message passing in SlotGAT is slot-specific (colored dashed arrows), e.g., in Layer 1, neighboring slot messages are passed to the slots of the same node type on node $v_2$. (c) The semantic mixing issue: every node maintains a single representation; in Layer 1, $v_2$ aggregates the message from $v_1$ by first applying the transformation $\tau(0,1)$ to convert $v_1$'s representation in type 0 to the feature space of $v_2$ in type 1, which mixes the two feature spaces.
  • Figure 2: The SlotGAT architecture. (i) SlotGAT initializes every node with multiple slots, one for each node type (3 types in the example); the slot for the node's own type is initialized by its transformed features, while the other slots are empty. (ii) In a slot-based message passing layer, slot representations are transformed and propagated separately, with an attention-based aggregation technique. (iii) After the last ($L$-th) layer, SlotGAT applies a slot attention technique to integrate the slots for downstream tasks.
  • Figure 3: Attention mechanism in a message passing layer.
  • Figure 4: Slot attention.
  • Figure 5: t-SNE visualization of all slot representations of the $2$-nd SlotGAT layer on IMDB.
  • ...and 1 more figures
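
The slot initialization and slot-wise propagation described in the captions of Figures 1 and 2 can be illustrated with a small toy sketch. This is not the authors' implementation: the function names `init_slots` and `slot_message_pass` are hypothetical, a uniform mean over the node and its neighbors stands in for the attention-based aggregation, and a single scalar per slot stands in for the learned slot-specific transformation. It only aims to show that slot $t$ of a node receives messages from slot $t$ of its neighbors, so the feature spaces of different node types are never merged.

```python
# Toy sketch of slot-based message passing (illustrative assumptions only).
# Assumed simplifications: uniform mean instead of attention, and one scalar
# weight per slot instead of a learned transformation matrix.

def init_slots(features, node_types, num_types):
    """Give every node one slot per node type.

    Only the slot matching the node's own type holds its features;
    all other slots start as zero vectors ("empty").
    """
    dim = len(features[0])
    slots = []
    for feat, t in zip(features, node_types):
        node_slots = [[0.0] * dim for _ in range(num_types)]
        node_slots[t] = [float(x) for x in feat]
        slots.append(node_slots)
    return slots


def slot_message_pass(slots, neighbors, weights):
    """One layer: slot t of node v averages slot t of v and of v's neighbors.

    Messages never cross slot indices, so type-specific feature spaces
    stay separate.
    """
    num_types = len(weights)
    new_slots = []
    for v, node_slots in enumerate(slots):
        group = [v] + neighbors[v]  # include the node itself
        updated = []
        for t in range(num_types):
            dim = len(node_slots[t])
            agg = [0.0] * dim
            for u in group:
                for k in range(dim):
                    agg[k] += weights[t] * slots[u][t][k]
            updated.append([a / len(group) for a in agg])
        new_slots.append(updated)
    return new_slots


# A path graph v0 - v1 - v2 with one node of each of 3 types.
features = [[1, 2], [3, 4], [5, 6]]
node_types = [0, 1, 2]
neighbors = [[1], [0, 2], [1]]
weights = [1.0, 1.0, 1.0]  # hypothetical per-slot scalars

slots = init_slots(features, node_types, num_types=3)
out = slot_message_pass(slots, neighbors, weights)
print(out[0])  # slots of v0 after one layer
```

After one layer in this toy setup, slot 1 of `v0` is filled from `v1`'s type-1 features, while slot 2 of `v0` remains zero because no type-2 node is within one hop. In the architecture described above, a slot attention step after the last layer would then weight and combine these per-type slots for the downstream task.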

Theorems & Definitions (2)

  • Theorem 4.1
  • proof