Efficient Identity and Position Graph Embedding via Spectral-Based Random Feature Aggregation

Meng Qin, Jiahong Liu, Irwin King

TL;DR

The paper addresses the challenge of unsupervised graph embedding that captures pure topology properties (identity and position) while remaining scalable. It proposes Random Feature Aggregation (RFA), a spectral-based, parameter-free GNN backbone fed with random noise and executed via a single forward pass, augmented by a degree correction mechanism that reshapes the graph spectrum. By employing low-pass and high-pass variants, RFA(L) and RFA(H) respectively extract informative node positions and identities, achieving strong embedding quality with superior efficiency compared to baselines. The approach demonstrates scalable, training-free embeddings across diverse graphs, highlighting the practical impact of exploiting graph spectral information for topology-focused representations. Potential future directions include theoretical guarantees, integration with attributed graphs, and automatic model configuration.

Abstract

Graph neural networks (GNNs), which capture graph structures via a feature aggregation mechanism following the graph embedding framework, have demonstrated a powerful ability to support various tasks. According to the topology properties (e.g., structural roles or community memberships of nodes) to be preserved, graph embedding can be categorized into identity and position embedding. However, for most GNN-based methods it is unclear which of these properties they capture. Some of them may also suffer from low efficiency and scalability caused by several time- and space-consuming procedures (e.g., feature extraction and training). From the perspective of graph signal processing, we find that high- and low-frequency information in the graph spectral domain may characterize node identities and positions, respectively. Based on this investigation, we propose random feature aggregation (RFA) for efficient identity and position embedding, serving as an extreme ablation study regarding GNN feature aggregation. RFA (i) adopts a spectral-based GNN without learnable parameters as its backbone, (ii) uses only random noise as input, and (iii) derives embeddings via just one feed-forward propagation (FFP). Inspired by degree-corrected spectral clustering, we further introduce a degree correction mechanism to the GNN backbone. Surprisingly, our experiments demonstrate that two variants of RFA with high- and low-pass filters can respectively derive informative identity and position embeddings via just one FFP (i.e., without any training). As a result, RFA can achieve a better trade-off between quality and efficiency for both identity and position embedding over various baselines.
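
As a rough illustration of the three ingredients listed in the abstract (a parameter-free spectral filter, random-noise input, and a single forward pass), the following Python sketch filters Gaussian noise over a graph. It is not the paper's formulation: the filters $({\bf{I}} \pm \hat{\bf{A}})/2$, the way `tau` is added to the node degrees as a stand-in for degree correction, the final row normalization, and every name (`rfa_sketch`, `dim`, `K`, `tau`, `mode`) are assumptions chosen for illustration.

```python
import numpy as np

def rfa_sketch(A, dim=16, K=4, tau=1.0, mode="low", seed=0):
    """Illustrative only: pass random noise through a fixed graph filter.

    A    : dense symmetric adjacency matrix (N x N), no self-loops.
    dim  : embedding dimensionality (hypothetical parameter).
    K    : number of times the filter is applied (hypothetical parameter).
    tau  : assumed degree-correction term added to every node degree.
    mode : "low" for a low-pass filter, "high" for a high-pass filter.
    """
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    deg = A.sum(axis=1)

    # Assumed degree-corrected normalization: (D + tau*I)^{-1/2} A (D + tau*I)^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(deg + tau)
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    # Generic textbook filters, not necessarily the ones used by RFA.
    I = np.eye(N)
    if mode == "low":
        F = 0.5 * (I + A_hat)   # keeps smooth (low-frequency) components
    elif mode == "high":
        F = 0.5 * (I - A_hat)   # keeps oscillating (high-frequency) components
    else:
        raise ValueError("mode must be 'low' or 'high'")

    # Random noise is the only input; no parameters are learned.
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((N, dim))
    for _ in range(K):
        Z = F @ Z

    # Row-normalize so that each node embedding has unit length.
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.maximum(norms, 1e-12)
```

Calling `rfa_sketch(A, mode="low")` and `rfa_sketch(A, mode="high")` on the same adjacency matrix gives two different embeddings from identical noise, which is the contrast the low-pass and high-pass variants rely on.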

Paper Structure

This paper contains 19 sections, 1 theorem, 9 equations, 9 figures, 9 tables, 1 algorithm.

Key Result

Theorem 3.2

The eigenvalues of a matrix ${\bf{M}} \in \mathbb{C}^{N \times N}$ lie in the union of $N$ discs $(\mathcal{D}_1, \cdots, \mathcal{D}_N)$, where $\mathcal{D}_i := \{ x \in \mathbb{C} : |x - {\bf{M}}_{ii}| \le r_i \}$ and $r_i := \sum\nolimits_{j = 1, j \ne i}^N |{\bf{M}}_{ij}|$.
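
The statement can be checked numerically. The sketch below computes the disc centers ${\bf{M}}_{ii}$ and radii $r_i$ with NumPy and verifies that every eigenvalue falls in at least one disc. The helper names `gershgorin_discs` and `eigenvalues_in_discs` and the tolerance `tol` are illustrative choices, and the example matrix (the Laplacian of a 3-node path graph) is not taken from the paper.

```python
import numpy as np

def gershgorin_discs(M):
    """Return (centers, radii) of the N Gershgorin discs of a square matrix M."""
    M = np.asarray(M, dtype=complex)
    centers = np.diag(M)
    # r_i = sum over j != i of |M_ij| = (row sum of |M|) - |M_ii|
    radii = np.abs(M).sum(axis=1) - np.abs(centers)
    return centers, radii

def eigenvalues_in_discs(M, tol=1e-9):
    """Check that each eigenvalue of M lies in at least one Gershgorin disc."""
    centers, radii = gershgorin_discs(M)
    eigvals = np.linalg.eigvals(np.asarray(M, dtype=complex))
    # dist[k, i] = |lambda_k - M_ii|
    dist = np.abs(eigvals[:, None] - centers[None, :])
    inside = dist <= radii[None, :] + tol
    return bool(inside.any(axis=1).all())

# Unnormalized Laplacian of the path graph 0 - 1 - 2 (illustrative example).
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
centers, radii = gershgorin_discs(L)
print(centers.real, radii)      # disc centers and radii
print(eigenvalues_in_discs(L))  # whether all eigenvalues are covered
```

For a graph Laplacian the center and radius of each disc both equal the node degree, so the discs bound the spectrum in terms of the degrees alone.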

Figures (9)

  • Figure 1: An example of node identities and positions, together with the eigendecomposition (ED) of the graph Laplacian (see Fig. \ref{Fig:Toy-1-vecs} for the full eigenvectors). Each color denotes a unique identity; nodes in the same community have similar positions.
  • Figure 2: Distributions of frequencies $\{ \tilde{\lambda}_r \}$ w.r.t. different settings of $\tau$ based on the graph in Fig. \ref{Fig:Toy}.
  • Figure 3: Intuition of computing NToS for trade-off analysis.
  • Figure 4: Scalability analysis results of RFA.
  • Figure 5: Parameter analysis w.r.t. $K$ on PPI, Youtube, Europe, and Actor in terms of micro-F1(%).
  • ...and 4 more figures

Theorems & Definitions (9)

  • Definition 2.1: Node Identity
  • Definition 2.2: Node Position
  • Definition 2.3: Graph Embedding
  • Definition 2.4: Graph Convolution
  • Remark 3.1
  • Theorem 3.2: Gershgorin Circle Theorem \cite{barany2017gershgorin}
  • Remark 3.3
  • Remark 3.4
  • Remark 3.5