PolyFormer: Scalable Node-wise Filters via Polynomial Graph Transformer

Jiahong Ma; Mingguo He; Zhewei Wei

TL;DR

PolyFormer introduces PolyAttn, an attention-based node-wise filter that operates on polynomial spectral tokens per node, enabling scalable learning of node-specific graph filters without costly Laplacian eigenvectors. By computing polynomial tokens from bases such as Monomial, Bernstein, Chebyshev, and Optimal, and applying tanh-based attention on per-node tokens, PolyAttn yields a node-wise, multi-channel filter that improves expressiveness over node-unified approaches. The full PolyFormer model stacks these blocks to deliver a scalable Graph Transformer for node-level tasks, achieving strong performance on both homophilic and heterophilic graphs, including large-scale graphs with up to 100 million nodes, while maintaining computational efficiency. Empirical results on synthetic and real-world datasets demonstrate superior filter learning, competitive accuracy against transformer-based models, and favorable preprocessing and runtime characteristics, underscoring the method's practical impact for large-scale graph learning.
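To make the idea of per-node polynomial tokens concrete, here is a minimal sketch using the Monomial basis only. It assumes that token k of node i is row i of P^k X, where P is the symmetrically normalized adjacency matrix and X the node feature matrix. The function name `monomial_tokens` and the choice of P are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def monomial_tokens(adj, x, K):
    """Return tokens of shape (N, K+1, d) with tokens[i, k] = (P^k X)[i].

    Assumption: P = D^{-1/2} A D^{-1/2}; other bases (Bernstein, Chebyshev,
    Optimal) would use a different recurrence and are not shown here.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # Guard against isolated nodes: their normalization factor stays zero.
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    p = inv_sqrt[:, None] * adj * inv_sqrt[None, :]

    tokens = [np.asarray(x, dtype=float)]   # order 0: the raw features
    for _ in range(K):
        tokens.append(p @ tokens[-1])       # order k from order k-1
    return np.stack(tokens, axis=1)

# Toy path graph 0 - 1 - 2 with one-hot features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
x = np.eye(3)
tokens = monomial_tokens(adj, x, K=2)
print(tokens.shape)  # (N, K+1, d)
```

Because the tokens depend only on the graph and the input features, they can be computed once in preprocessing using K propagation steps, with no eigendecomposition of the Laplacian.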

Abstract

Spectral Graph Neural Networks have demonstrated superior performance in graph representation learning. However, many current methods employ shared polynomial coefficients for all nodes, i.e., they learn node-unified filters, which limits the filters' flexibility for node-level tasks. The recent DSF attempts to overcome this limitation by learning node-wise coefficients based on positional encoding. However, initializing and updating the positional encoding is burdensome, which hinders scalability on large-scale graphs. In this work, we propose a scalable node-wise filter, PolyAttn. Leveraging the attention mechanism, PolyAttn can directly learn node-wise filters efficiently, offering powerful representation capabilities. Building on PolyAttn, we introduce the whole model, named PolyFormer. Through the lens of Graph Transformer models, PolyFormer, which calculates attention scores within nodes, shows great scalability. Moreover, the model captures spectral information, enhancing expressiveness while maintaining efficiency. With these advantages, PolyFormer offers a desirable balance between scalability and expressiveness for node-level tasks. Extensive experiments demonstrate that our proposed methods excel at learning arbitrary node-wise filters, showing superior performance on both homophilic and heterophilic graphs, and handling graphs containing up to 100 million nodes. The code is available at https://github.com/air029/PolyFormer.
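The scalability argument rests on computing attention among the K+1 tokens of a single node rather than among all N nodes. The sketch below illustrates that shape difference with a simplified tanh-scored attention. The projection matrices `w_q` and `w_k`, the scaling by the square root of the projection width, and the use of raw tokens as values are assumptions made for illustration; the actual PolyAttn block may include further components.

```python
import numpy as np

def per_node_tanh_attention(tokens, w_q, w_k):
    """Simplified within-node attention.

    tokens: (N, K+1, d) polynomial tokens, one sequence of K+1 tokens per node.
    Returns z of shape (N, d) and scores of shape (N, K+1, K+1).
    """
    q = tokens @ w_q                      # (N, K+1, d_a)
    k = tokens @ w_k                      # (N, K+1, d_a)
    # Scores only relate tokens of the same node, so the score tensor has
    # N * (K+1)^2 entries instead of the N^2 of node-to-node attention.
    scores = np.tanh(q @ k.transpose(0, 2, 1) / np.sqrt(w_q.shape[1]))
    updated = scores @ tokens             # (N, K+1, d)
    z = updated.sum(axis=1)               # sum the updated tokens per node
    return z, scores

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 4, 3))       # N=5 nodes, K+1=4 tokens, d=3
w_q = rng.normal(size=(3, 2))
w_k = rng.normal(size=(3, 2))
z, scores = per_node_tanh_attention(tokens, w_q, w_k)
print(z.shape, scores.shape)
```

Since each node is processed independently, nodes can be batched freely, which is what makes mini-batch training on very large graphs practical under these assumptions.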

Paper Structure (33 sections, 3 theorems, 9 equations, 4 figures, 8 tables, 2 algorithms)

Introduction
Background
Notations
Graph Filter
Transformer
PolyFormer
Polynomial Token
PolyAttn and PolyFormer
Theoretical Analysis
Complexity
Connection to Spectral Filtering
Experiments
PolyAttn Experiments
Fitting Signals on Synthetic Datasets
Performance on Real-world Datasets
...and 18 more sections

Key Result

Theorem 3.1

With polynomial tokens as input, PolyAttn operates as a node-wise filter. Specifically, for the representation $\mathbf{Z}_{i,:} = \sum_{k=0}^{K} \mathbf{H'}^{(i)}_{k,:}$ of node $v_i$ after applying PolyAttn: Here, the coefficients $\alpha_k^{(i)}$ depend not only on the polynomial order $k$ but also on the specific node $v_i$. In other words, PolyAttn performs a node-wise polynomial filter on t
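The node-wise filter form that this theorem describes can be sketched as a short derivation. It assumes polynomial tokens of the form $\mathbf{h}_k^{(i)} = (g_k(\mathbf{P})\mathbf{X})_{i,:}$, where $g_k$ is the $k$-th basis polynomial, $\mathbf{P}$ a propagation matrix, and $\mathbf{X}$ the feature matrix, and it assumes a simplified attention update $\mathbf{H'}^{(i)} = \mathbf{S}^{(i)}\mathbf{H}^{(i)}$ with a per-node score matrix $\mathbf{S}^{(i)}$. The symbols $g_k$, $\mathbf{P}$, $\mathbf{X}$, and $\mathbf{S}^{(i)}$ are not defined in this summary and are introduced here for illustration only.

```latex
\begin{aligned}
\mathbf{Z}_{i,:}
  &= \sum_{k=0}^{K} \mathbf{H'}^{(i)}_{k,:}
   = \sum_{k=0}^{K} \sum_{j=0}^{K} \mathbf{S}^{(i)}_{k,j}\, \mathbf{h}_j^{(i)} \\
  &= \sum_{j=0}^{K} \Bigl( \sum_{k=0}^{K} \mathbf{S}^{(i)}_{k,j} \Bigr) \mathbf{h}_j^{(i)}
   = \sum_{j=0}^{K} \alpha_j^{(i)} \bigl( g_j(\mathbf{P})\mathbf{X} \bigr)_{i,:},
\qquad
\alpha_j^{(i)} := \sum_{k=0}^{K} \mathbf{S}^{(i)}_{k,j}.
\end{aligned}
```

Under these assumptions the coefficient $\alpha_j^{(i)}$ carries the node index $i$ because the scores $\mathbf{S}^{(i)}$ are computed from that node's own tokens, whereas a node-unified filter would use a single $\alpha_j$ shared by all nodes.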

Figures (4)

Figure 1: Illustration of the proposed PolyFormer. For a given graph, polynomial tokens are computed for each node. These tokens are then processed by PolyFormer, which consists of $L$ blocks. With the defined polynomial tokens, PolyAttn within each block functions as a node-wise filter in the spectral domain, adaptively learning a graph filter specific to each node.
Figure 2: Learned filters of PolyAttn (Cheb).
Figure 3: Filters learned by UniFilter (left) and PolyAttn (right) on the homophilic graph Pubmed (top) and the heterophilic graph Questions (bottom).
Figure 4: Accuracy, training time, and relative maximum GPU memory consumption comparison on Roman-empire.

Theorems & Definitions (4)

Definition 3.1
Theorem 3.1
Proposition 1
Proposition 2
