GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers
Guoguo Ai, Guansong Pang, Hezhe Qiao, Yuan Gao, Hui Yan
TL;DR
This work tackles the limited ability of Graph Transformers (GTs) to capture high-frequency graph signals, a limitation caused by the low-pass bias of self-attention. It introduces GrokFormer, which uses a Graph Fourier Kolmogorov-Arnold Network (GFKAN) to learn an order- and spectrum-adaptive spectral filter $h(\lambda)$ through Fourier-series activations across a $K$-order spectrum, combined with an efficient self-attention path. The approach is more expressive than prior spectral methods and remains scalable, as supported by theoretical results and by experiments on 10 node-classification datasets and 5 graph-classification datasets, where GrokFormer consistently outperforms state-of-the-art GTs and GNNs. The result is a practical, highly expressive GT framework with a public implementation.
Abstract
Graph Transformers (GTs) have demonstrated remarkable performance in graph representation learning, surpassing popular graph neural networks (GNNs). However, self-attention, the core module of GTs, preserves only low-frequency signals in graph features, which makes it ineffective at capturing other important signals such as high-frequency ones. Some recent GT models help alleviate this issue, but their flexibility and expressiveness remain limited because the filters they learn are fixed to a predefined graph spectrum or spectral order. To tackle this challenge, we propose the Graph Fourier Kolmogorov-Arnold Transformer (GrokFormer), a novel GT model that learns highly expressive spectral filters with an adaptive graph spectrum and spectral order by modeling learnable activation functions with Fourier series. We demonstrate theoretically and empirically that the proposed GrokFormer filter is more expressive than other spectral methods. Comprehensive experiments on 10 real-world node classification datasets across various domains, scales, and graph properties, as well as 5 graph classification datasets, show that GrokFormer outperforms state-of-the-art GTs and GNNs. Our code is available at https://github.com/GGA23/GrokFormer
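To make the idea of a Fourier-series spectral filter concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a filter of the form $h(\lambda) = \sum_{k=1}^{K} \alpha_k \sum_{m} \big(a_{km}\cos(m\lambda^k) + b_{km}\sin(m\lambda^k)\big)$ applied to the eigenvalues of the symmetric normalized Laplacian; the exact parameterization in the paper may differ. The function names, the coefficient arrays `a`, `b`, `alpha`, and the dense eigendecomposition are all illustrative assumptions, and the coefficients here are random rather than learned.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def fourier_filter(lam, a, b, alpha):
    """Evaluate the assumed filter h(lam) at each eigenvalue.

    a, b have shape (K, M): row k-1 holds the cosine / sine coefficients
    for spectral order k, with frequencies m = 0, ..., M-1.
    alpha has shape (K,) and weights each spectral order.
    """
    num_orders, num_freqs = a.shape
    freqs = np.arange(num_freqs)
    h = np.zeros_like(lam, dtype=float)
    for k in range(1, num_orders + 1):
        phase = np.outer(lam ** k, freqs)  # shape (n, M)
        h += alpha[k - 1] * (np.cos(phase) @ a[k - 1] + np.sin(phase) @ b[k - 1])
    return h


def spectral_filter(adj, x, a, b, alpha):
    """Filter node features x as U diag(h(lam)) U^T x."""
    lam, eigvecs = np.linalg.eigh(normalized_laplacian(adj))
    h = fourier_filter(lam, a, b, alpha)
    return eigvecs @ (h[:, None] * (eigvecs.T @ x)), lam, h


# Toy usage: a 4-node path graph with random (untrained) coefficients.
rng = np.random.default_rng(0)
path_adj = np.array(
    [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float
)
features = rng.normal(size=(4, 2))
K, M = 3, 4  # hypothetical number of spectral orders and frequencies
filtered, eigenvalues, response = spectral_filter(
    path_adj,
    features,
    a=rng.normal(size=(K, M)),
    b=rng.normal(size=(K, M)),
    alpha=np.ones(K),
)
```

Because the response `h` is a free function of the eigenvalues, suitable coefficients can emphasize low, high, or intermediate frequencies, which is the kind of flexibility the abstract attributes to the learned filter. The full eigendecomposition used here costs cubic time in the number of nodes and is only meant for small illustrative graphs.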
