LSPI: Heterogeneous Graph Neural Network Classification Aggregation Algorithm Based on Size Neighbor Path Identification

Yufei Zhao, Shiduo Wang, Hua Duan

TL;DR

LSPI tackles the problem of heterogeneous graph neural networks being sensitive to unequal meta-path neighbor counts and noise in large neighbor paths. It introduces a path discriminator to split meta-paths into LargePaths and SmallPaths, applies topology- and feature-based filtering to large-path neighborhoods, and uses intra-path convolutions plus subgraph-level attention to fuse information. Across ACM, IMDB, and Yelp, LSPI outperforms strong baselines, with notable gains on larger, noisier datasets and robust performance under ablation and noise experiments. The work provides actionable guidance on hyperparameters (e.g., threshold $\tau$ and large-path node count $T$) and contributes reproducible code for practitioners.
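
As a rough illustration of the path-splitting step, the sketch below assumes the path discriminator compares each meta-path's mean neighbor count against the threshold $\tau$. That criterion is an assumption made for illustration and may differ from the paper's actual rule. The function names, the toy adjacency matrices, and the value of `tau` are all hypothetical.

```python
import numpy as np

def mean_neighbor_count(adj):
    """Average number of meta-path-based neighbors per node (self-loops ignored)."""
    a = np.asarray(adj) != 0          # new boolean array, safe to modify
    np.fill_diagonal(a, False)
    return a.sum(axis=1).mean()

def split_paths(metapath_adjs, tau):
    """Assumed discriminator: a meta-path is 'large' if its mean
    neighbor count exceeds tau, otherwise 'small'."""
    large, small = [], []
    for name, adj in metapath_adjs.items():
        if mean_neighbor_count(adj) > tau:
            large.append(name)
        else:
            small.append(name)
    return large, small

# Toy example: PAP links few papers, PSP links every pair of papers.
pap = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]])
psp = np.ones((4, 4))
large_paths, small_paths = split_paths({"PAP": pap, "PSP": psp}, tau=2.0)
```

Here `PAP` averages one neighbor per node and `PSP` averages three, so with `tau=2.0` only `PSP` is treated as a large neighbor path.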

Abstract

Existing heterogeneous graph neural networks (HGNNs) mostly rely on meta-paths to capture the rich semantic information in heterogeneous graphs (also known as heterogeneous information networks, HINs). However, most of these HGNNs focus on different ways of aggregating features and ignore the properties of the meta-paths themselves. This paper studies meta-paths in three commonly used datasets and finds large differences in the number of neighbors connected by different meta-paths. In addition, the noise contained in large neighbor paths adversely affects model performance. This paper therefore proposes LSPI, a Heterogeneous Graph Neural Network Classification and Aggregation Algorithm Based on Large and Small Neighbor Path Identification. LSPI first divides the meta-paths into large and small neighbor paths using a path discriminator. To reduce noise interference in large neighbor paths, LSPI selects neighbor nodes with higher similarity from both topology and feature perspectives, then passes the small neighbor paths and the filtered large neighbor paths through different graph convolution components. This aggregation yields feature information under different subgraphs, which LSPI fuses with subgraph-level attention to generate the final node embedding. Finally, extensive experiments verify the superiority of the method and yield suggestions on the number of nodes to retain in large neighbor paths. The complete reproducible code and data have been published at: https://github.com/liuhua811/LSPIA.
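
The filtering and fusion steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: topology similarity is taken to be the Jaccard overlap of neighbor sets, feature similarity is cosine similarity, the two are simply summed, and the fusion uses a HAN-style attention score. The names `top_t_filter`, `fuse_subgraphs`, `W`, `b`, and `q` are hypothetical.

```python
import numpy as np

def top_t_filter(adj, feats, T):
    """Keep, for each node, its T most similar neighbors on one large path.

    Assumed similarity: Jaccard overlap of neighbor sets (topology)
    plus cosine similarity of node features.
    """
    a = (np.asarray(adj) != 0).astype(float)
    np.fill_diagonal(a, 0.0)
    # Topology view: Jaccard overlap between neighbor sets.
    inter = a @ a.T
    deg = a.sum(axis=1)
    union = deg[:, None] + deg[None, :] - inter
    topo = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    # Feature view: cosine similarity of row-normalized features.
    x = np.asarray(feats, dtype=float)
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.where(norm > 0, norm, 1.0)
    score = topo + x @ x.T
    # Non-neighbors get -inf so they can never be selected.
    score = np.where(a > 0, score, -np.inf)
    keep = np.zeros_like(a)
    for i in range(a.shape[0]):
        k = min(T, int(deg[i]))
        if k > 0:
            top = np.argsort(-score[i], kind="stable")[:k]
            keep[i, top] = 1.0
    return keep

def fuse_subgraphs(embs, W, b, q):
    """Assumed HAN-style subgraph-level attention.

    embs: list of (N, d) embeddings, one per subgraph.
    Returns the fused (N, d) embedding and the attention weights.
    """
    scores = np.array([(np.tanh(z @ W + b) @ q).mean() for z in embs])
    beta = np.exp(scores - scores.max())   # numerically stable softmax
    beta /= beta.sum()
    fused = sum(w * z for w, z in zip(beta, embs))
    return fused, beta

# Toy usage: 4 fully connected nodes, keep one neighbor each.
adj = np.ones((4, 4))
feats = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.1]])
kept = top_t_filter(adj, feats, T=1)

embs = [np.ones((4, 2)), np.zeros((4, 2))]
fused, beta = fuse_subgraphs(embs, W=np.eye(2), b=np.zeros(2), q=np.ones(2))
```

Each row of `kept` retains at most `T` neighbors, so the filtered large path has a bounded neighborhood size before it is passed to its convolution component.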

Paper Structure

This paper contains 23 sections, 17 equations, 8 figures, 7 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Example of a heterogeneous graph (ACM). (a) A heterogeneous graph ACM composed of three node types, where P denotes paper, A denotes author, and S denotes subject; (b) Several meta-paths with different semantics and different lengths in the heterogeneous graph ACM; (c) Neighbors based on the three meta-paths; for simplicity, the following content only expresses meta-paths in the order of node connections, such as P-A-P abbreviated as PAP.
  • Figure 2: Mean difference in the number of node neighbors under different meta-paths.
  • Figure 3: Accuracy of HAN with different meta-paths.
  • Figure 4: LSPI framework structure.
  • Figure 5: Visualization experiments for node embedding on ACM datasets.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Definition 1: Heterogeneous Graph bib10
  • Definition 2: Meta-paths bib10
  • Definition 3: Meta-path-based Neighbors bib11
  • Definition 4: Meta-path-based Neighbors bib22