Heterogeneous Attributed Graph Learning via Neighborhood-Aware Star Kernels

Hong Huang, Chengyu Yao, Haiming Chen, Hang Gao

TL;DR

This paper tackles graph learning on attributed graphs by introducing Neighborhood-Aware Star Kernel (NASK), a positive definite kernel that jointly models heterogeneous attribute semantics and neighborhood structure. NASK uses an exponential transformation of the Gower similarity to compute a PD node/edge attribute similarity, and builds star-subgraph kernels enhanced with Weisfeiler-Lehman iterations to capture multi-scale neighborhood information. The authors prove PD for the core components, construct a complete PD kernel K_NAS^{(H)} that aggregates over h-hop star subgraphs, and demonstrate compatibility with SVMs. Empirical results on eleven diverse and four large-scale datasets show that NASK consistently outperforms sixteen baselines, including graph kernels and Graph Neural Networks, with favorable scalability and robustness to attribute perturbations. Overall, NASK provides a principled, scalable framework for attributed graph classification that leverages both attribute semantics and structured neighborhood information.

Abstract

Attributed graphs, typically characterized by irregular topologies and a mix of numerical and categorical attributes, are ubiquitous in diverse domains such as social networks, bioinformatics, and cheminformatics. While graph kernels provide a principled framework for measuring graph similarity, existing kernel methods often struggle to simultaneously capture heterogeneous attribute semantics and neighborhood information in attributed graphs. In this work, we propose the Neighborhood-Aware Star Kernel (NASK), a novel graph kernel designed for attributed graph learning. NASK leverages an exponential transformation of the Gower similarity coefficient to jointly model numerical and categorical features efficiently, and employs star substructures enhanced by Weisfeiler-Lehman iterations to integrate multi-scale neighborhood structural information. We theoretically prove that NASK is positive definite, ensuring compatibility with kernel-based learning frameworks such as SVMs. Extensive experiments are conducted on eleven attributed and four large-scale real-world graph benchmarks. The results demonstrate that NASK consistently achieves superior performance over sixteen state-of-the-art baselines, including nine graph kernels and seven Graph Neural Networks.
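As a rough illustration of the attribute-similarity idea in the abstract, the sketch below computes a Gower similarity over mixed attributes and then applies an exponential transformation. The specific form `exp(-gamma * (1 - S))`, the `gamma` parameter, the handling of binary attributes as plain category matches, and the function names are all assumptions made for illustration; the abstract does not give the paper's exact definition.

```python
import math

def gower_similarity(x, y, is_numeric, ranges):
    """Mean per-attribute Gower similarity between two attribute vectors.

    is_numeric[d] marks attribute d as numerical; ranges[d] is its value
    range over the dataset (ignored for non-numerical attributes).
    """
    sims = []
    for xd, yd, numeric, r in zip(x, y, is_numeric, ranges):
        if numeric:
            # Range-normalized absolute difference; a zero range means
            # the attribute is constant, so the values trivially agree.
            sims.append(1.0 - abs(xd - yd) / r if r > 0 else 1.0)
        else:
            # Categorical or binary attribute: exact match or not.
            sims.append(1.0 if xd == yd else 0.0)
    return sum(sims) / len(sims)

def exp_gower_kernel(x, y, is_numeric, ranges, gamma=1.0):
    """Assumed exponential transform of the Gower dissimilarity 1 - S."""
    s = gower_similarity(x, y, is_numeric, ranges)
    return math.exp(-gamma * (1.0 - s))

# Hypothetical node attributes: (numerical, categorical, binary).
a = (2.0, "C", 1)
b = (4.0, "N", 1)
is_numeric = (True, False, False)
ranges = (10.0, None, None)

print(gower_similarity(a, b, is_numeric, ranges))
print(exp_gower_kernel(a, b, is_numeric, ranges))
```

Under these assumptions, identical attribute vectors receive a kernel value of exactly 1, and the value decays as the attributes disagree more.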

Paper Structure

This paper contains 48 sections, 6 theorems, 31 equations, 6 figures, 6 tables, 2 algorithms.

Key Result

Lemma 1

Let $s_d(x_d, x_d')$ be a symmetric similarity function defined on the $d$-th attribute. Define $f_d(x_d, x_d') := 1 - s_d(x_d, x_d')$. Then, under the constructions of $s_d$ for numerical, categorical, and binary attributes in the Gower similarity coefficient (see Definition 3)

Figures (6)

  • Figure 1: Overview of our proposed NASK. The two attributed graphs $\mathrm{G}$ and $\mathrm{G}'$ are shown in the yellow box, and the attribute information of graph $\mathrm{G}$ is presented in the red box. Step 1 illustrates the similarity measurement between attributed star subgraphs. Step 2 demonstrates the construction of the graph kernel $K_\mathrm{S}$ based on these star subgraphs, denoted as $S_i$. Finally, Step 3 defines the NASK, formally denoted as $K_{\mathrm{NAS}}$, by integrating the methods from the previous two steps with the WL algorithm to expand each star subgraph with its one-hop neighborhood. For example, the expanded star graph from $S_1$ is represented as $\{S_1, \{S_2, S_3, S_4, S_5, S_6\}\}$. We present a simple example for illustration, though our method is general and not limited to this case.
  • Figure 2: Impact of depths of WL iteration on classification accuracy and runtime of NASK across four datasets.
  • Figure 3: Visualization of multi-hop star subgraphs centered on key nodes.
  • Figure 4: Performance of NASK and its ablation variants on several additional datasets.
  • Figure 5: Accuracy of NASK under attribute perturbations.
  • ...and 1 more figure
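
The one-hop expansion described in the Figure 1 caption, where the star $S_1$ becomes $\{S_1, \{S_2, \dots, S_6\}\}$, can be sketched as follows. This is only a structural illustration on a hypothetical toy graph in which node 1 is adjacent to nodes 2 through 6; the helper names `star` and `expanded_star` are invented here, and the sketch omits the attribute kernel and the WL relabeling used by the actual method.

```python
def star(adj, v):
    """Star subgraph centered at v: the center plus its sorted neighbors."""
    return (v, tuple(sorted(adj[v])))

def expanded_star(adj, v):
    """Pair the star at v with the stars centered at each of its neighbors."""
    center = star(adj, v)
    neighbor_stars = tuple(star(adj, u) for u in center[1])
    return (center, neighbor_stars)

# Hypothetical toy graph (not taken from the paper's figure):
# node 1 is connected to nodes 2..6, which have no other edges.
adj = {1: {2, 3, 4, 5, 6}, 2: {1}, 3: {1}, 4: {1}, 5: {1}, 6: {1}}

s1, neighbors = expanded_star(adj, 1)
print(s1)                          # star S_1
print([s[0] for s in neighbors])   # centers of the neighboring stars
```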

Theorems & Definitions (21)

  • Definition 1: Attributed Graph
  • Example 1
  • Definition 2: Attributed Star Graph
  • Example 2
  • Definition 3: Gower similarity coefficient
  • Definition 4: Normalized numerical similarity (Gower)
  • Definition 5: Normalized categorical similarity (Gower)
  • Lemma 1
  • Lemma 2
  • Theorem 1
  • ...and 11 more