Layer-diverse Negative Sampling for Graph Neural Networks

Wei Duan, Jie Lu, Yu Guang Wang, Junyu Xuan

TL;DR

Traditional GNNs aggregate only from positive samples (first-order neighbours), which can lead to over-smoothing, over-squashing, and limited expressivity. This paper introduces Layer-diverse Negative Sampling (LDNS) to address these issues. LDNS uses a DPP-based sampling matrix and a space-squeezing operation to draw diverse negative samples that differ across layers, a shortest-path candidate set to keep computation tractable, and a layer-aware k-DPP sampling step to reduce redundancy between layers. Empirically, LDNS improves node classification accuracy across seven homophilous and three heterophilous datasets, reduces cross-layer overlap in negative samples, and is robust across multiple GNN architectures, while offering a favorable trade-off between performance and time complexity. The results indicate that dynamically adding carefully chosen negative samples rewires information flow, which can enhance GNN expressivity and reduce over-squashing at practical computational cost.

Abstract

Graph neural networks (GNNs) are a powerful solution for various structure learning applications due to their strong representation capabilities for graph data. However, traditional GNNs rely on message-passing mechanisms that gather information exclusively from first-order neighbours (known as positive samples), which can lead to issues such as over-smoothing and over-squashing. To mitigate these issues, we propose a layer-diverse negative sampling method for message-passing propagation. This method employs a sampling matrix within a determinantal point process, which transforms the candidate set into a space and selectively samples from this space to generate negative samples. To further enhance the diversity of the negative samples during each forward pass, we develop a space-squeezing method to achieve layer-wise diversity in multi-layer GNNs. Experiments on various real-world graph datasets demonstrate the effectiveness of our approach in improving the diversity of negative samples and overall learning performance. Moreover, adding negative samples dynamically changes the graph's topology, and thus has strong potential to improve the expressiveness of GNNs and reduce the risk of over-squashing.
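
As a rough illustration of why a determinantal point process favours diverse negatives, the following Python sketch samples a size-k subset with probability proportional to the determinant of the corresponding kernel submatrix. It enumerates all subsets by brute force, so it only suits tiny candidate sets. The Gram kernel built from raw feature vectors and the helper names (`gram_kernel`, `k_dpp_distribution`, `sample_k_dpp`) are assumptions made for this example, not the paper's sampling matrix or implementation.

```python
import itertools
import random


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular submatrix: linearly dependent items
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def gram_kernel(features):
    """Assumed kernel for this sketch: inner products of feature vectors."""
    return [[sum(x * y for x, y in zip(f, g)) for g in features] for f in features]


def k_dpp_distribution(kernel, k):
    """Exact k-DPP over all size-k subsets: P(S) is proportional to det(L_S)."""
    subsets = list(itertools.combinations(range(len(kernel)), k))
    weights = [max(det([[kernel[i][j] for j in s] for i in s]), 0.0) for s in subsets]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("every size-k subset has zero determinant")
    return subsets, [w / total for w in weights]


def sample_k_dpp(kernel, k, rng):
    """Draw one size-k subset from the distribution above."""
    subsets, probs = k_dpp_distribution(kernel, k)
    return rng.choices(subsets, weights=probs, k=1)[0]


# Toy candidate set: items 0 and 1 are identical, item 2 is orthogonal to both.
features = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
kernel = gram_kernel(features)
subsets, probs = k_dpp_distribution(kernel, 2)
sample = sample_k_dpp(kernel, 2, random.Random(0))
```

In this toy case the pair of identical candidates receives zero probability, so every draw contains the orthogonal candidate. This is the redundancy-avoiding behaviour that motivates using a DPP for negative sampling.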

Paper Structure (29 sections, 41 equations, 10 figures, 19 tables, 4 algorithms)

This paper contains 29 sections, 41 equations, 10 figures, 19 tables, 4 algorithms.

Figures (10)

  • Figure 1: Negative samples from layer-diverse DPP sampling. (a) For a given node in a graph, its first-order neighbours can be thought of as positive samples, despite the fact that these neighbours may belong to different clusters. (b) Algorithm \ref{alg:spath} calculates the shortest path from a given node to other nodes in the graph to obtain smaller, yet more efficient candidate sets for further sampling. (c) As the candidate set is significantly larger than the number of negative samples needed, the ideal subset of negative samples is not unique. By using the layer-diverse DPP sampling method to select negative samples, it is possible to include as much information from the entire graph as possible while also reducing redundancy among negative samples in different layers.
  • Figure 2: Illustration of the layer-diverse sampling process. (a) For a candidate set with 3 nodes, construct the matrix $\boldsymbol{V}^{3\times3}$. The original space is spanned by the eigenvectors $\{\boldsymbol{v}_1, \boldsymbol{v}_2, \boldsymbol{v}_3\}$, and every node in the candidate set corresponds to a coloured vector in this space. (b) Suppose node 1 (green vector), which has the greatest impact on $\boldsymbol{v}_1$ (the column $\boldsymbol{V}[:,1]$), was selected in the previous layer; we then squeeze the space along the $\boldsymbol{V}[:,1]$ direction. If another node's projection onto $\boldsymbol{V}[:,1]$ has the same sign as the green one, it is re-scaled in the same direction (the orange vector); otherwise it is re-scaled in the opposite direction (the blue vector). (c) This operation results in a new space in which the $\boldsymbol{V}[:,1]$ component is significantly cut off, which means the probability of picking the corresponding node 1 has been reduced.
  • Figure 3: Case 1: Adding negative samples can help a GNN learn different embeddings for different structures. Dashed lines indicate added negative samples. (a) After adding negative samples, MAX can distinguish different structures. (b) After adding negative samples, MAX and MEAN can distinguish different structures.
  • Figure 4: Case 2: At layer $l-1$, the MAX and MEAN aggregators still cannot distinguish the different structures after negative samples are added. However, because the layer-diverse method obtains samples that differ from those of the previous layer, adding negative samples at layer $l$ enables the MAX and MEAN aggregators to distinguish the structures.
  • Figure 5: Case 3: (a) Aggregators in the original graph can distinguish different structures. (b) Under a specific condition, adding negative samples has a small probability of preventing this. (c) Even if this situation occurs, the layer-diverse approach will address it in the next layer.
  • ...and 5 more figures
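
The squeezing idea captioned in Figure 2 can be illustrated with a simplified Python sketch. It assumes each candidate is already given by its coordinates along the eigenvector directions, uses a hypothetical shrink `factor`, and measures selection weight by squared vector length (the probability of each item under a DPP that picks a single item). The function names and numbers are illustrative, and the paper's exact operation may differ.

```python
def selection_shares(vectors):
    """Normalised squared lengths: P({i}) for a DPP that picks exactly one item."""
    norms = [sum(x * x for x in v) for v in vectors]
    total = sum(norms)
    return [n / total for n in norms]


def squeeze(vectors, chosen, factor=0.1):
    """Shrink all candidates along the axis where `chosen` has its largest component.

    Multiplying by a positive factor below 1 pulls positive and negative
    coordinates toward zero from opposite sides.
    """
    axis = max(range(len(vectors[chosen])), key=lambda j: abs(vectors[chosen][j]))
    squeezed = [
        [x * factor if j == axis else x for j, x in enumerate(v)] for v in vectors
    ]
    return axis, squeezed


# Hypothetical coordinates of three candidates along three eigenvector directions.
candidates = [
    [0.9, 0.1, 0.0],   # node picked in the previous layer (dominant on axis 0)
    [0.5, 0.5, 0.2],   # same sign as the picked node on axis 0
    [-0.3, 0.2, 0.8],  # opposite sign on axis 0
]

before = selection_shares(candidates)
axis, squeezed = squeeze(candidates, chosen=0)
after = selection_shares(squeezed)
```

In this example the previously picked node holds the largest share before squeezing and the smallest afterwards, so a later layer becomes more likely to pick a different negative sample.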

Theorems & Definitions (6)

  • Remark 3.1
  • Remark 3.2
  • Remark A.1
  • proof
  • Remark A.2
  • proof