Heterogeneous Attributed Graph Learning via Neighborhood-Aware Star Kernels
Hong Huang, Chengyu Yao, Haiming Chen, Hang Gao
TL;DR
This paper tackles learning on attributed graphs by introducing the Neighborhood-Aware Star Kernel (NASK), a positive definite (PD) kernel that jointly models heterogeneous attribute semantics and neighborhood structure. NASK applies an exponential transformation to the Gower similarity to obtain a PD similarity over node and edge attributes, and builds star-subgraph kernels enhanced with Weisfeiler-Lehman iterations to capture multi-scale neighborhood information. The authors prove that the core components are PD, construct a complete PD kernel K_NAS^{(H)} that aggregates over h-hop star subgraphs, and show that it is compatible with SVMs. Experiments on eleven attributed and four large-scale datasets show that NASK consistently outperforms sixteen baselines, including graph kernels and Graph Neural Networks, with favorable scalability and robustness to attribute perturbations. Overall, NASK provides a principled, scalable framework for attributed graph classification that leverages both attribute semantics and structured neighborhood information.
Abstract
Attributed graphs, typically characterized by irregular topologies and a mix of numerical and categorical attributes, are ubiquitous in diverse domains such as social networks, bioinformatics, and cheminformatics. While graph kernels provide a principled framework for measuring graph similarity, existing kernel methods often struggle to simultaneously capture heterogeneous attribute semantics and neighborhood information in attributed graphs. In this work, we propose the Neighborhood-Aware Star Kernel (NASK), a novel graph kernel designed for attributed graph learning. NASK leverages an exponential transformation of the Gower similarity coefficient to jointly model numerical and categorical features efficiently, and employs star substructures enhanced by Weisfeiler-Lehman iterations to integrate multi-scale neighborhood structural information. We theoretically prove that NASK is positive definite, ensuring compatibility with kernel-based learning frameworks such as SVMs. Extensive experiments are conducted on eleven attributed and four large-scale real-world graph benchmarks. The results demonstrate that NASK consistently achieves superior performance over sixteen state-of-the-art baselines, including nine graph kernels and seven Graph Neural Networks.
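To make the attribute-similarity idea concrete, the following is a minimal sketch of an exponentially transformed Gower similarity between mixed-type attribute vectors. The abstract does not state the exact form of the transformation, so the form exp(-gamma * (1 - S_Gower)) used here is an assumption, as are the function names, the gamma parameter, and the toy data. Under this assumed form the kernel factorizes into per-attribute Laplacian and match/mismatch kernels, each of which is positive definite, so the product is positive definite as well; the sketch checks this numerically on a small Gram matrix.

```python
import numpy as np

def gower_similarity(x_num, y_num, x_cat, y_cat, ranges):
    """Gower similarity between two mixed-type attribute vectors."""
    # Numerical attributes: 1 minus the range-normalised absolute difference.
    num_sim = 1.0 - np.abs(x_num - y_num) / ranges
    # Categorical attributes: 1 on an exact match, 0 otherwise.
    cat_sim = (x_cat == y_cat).astype(float)
    # Gower similarity is the mean of the per-attribute similarities.
    return np.concatenate([num_sim, cat_sim]).mean()

def exp_gower_kernel(x_num, y_num, x_cat, y_cat, ranges, gamma=1.0):
    """Assumed transformation (not taken from the paper): exp(-gamma * (1 - S))."""
    s = gower_similarity(x_num, y_num, x_cat, y_cat, ranges)
    return np.exp(-gamma * (1.0 - s))

# Hypothetical toy data: four nodes, two numerical and one categorical attribute.
num = np.array([[0.0, 10.0], [1.0, 20.0], [0.5, 15.0], [0.2, 12.0]])
cat = np.array([["C"], ["N"], ["C"], ["O"]])
ranges = num.max(axis=0) - num.min(axis=0)  # per-attribute range, nonzero here

n = len(num)
K = np.array([[exp_gower_kernel(num[i], num[j], cat[i], cat[j], ranges)
               for j in range(n)] for i in range(n)])

# Smallest eigenvalue of the symmetric Gram matrix; nonnegative if K is PSD.
min_eig = np.linalg.eigvalsh(K).min()
```

A Gram matrix built this way can be passed to a kernel method that accepts precomputed kernels. The sketch covers only the attribute-similarity component, not the star-subgraph or Weisfeiler-Lehman parts of NASK.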
