
PSNE: Efficient Spectral Sparsification Algorithms for Scaling Network Embedding

Longlong Lin, Yunfeng Yu, Zihao Wang, Zeli Wang, Yuying Zhao, Jin Zhao, Tao Jia

TL;DR

The paper tackles the scalability gap in PPR-based network embedding by introducing PSNE, a spectral sparsification framework that directly approximates the full PPR matrix with a Frobenius-norm guarantee. It augments this sparse PPR with a multiple-perspective proximity (MP-PPR) to better capture structural patterns, and then uses randomized SVD to extract high-quality node embeddings. The authors provide theoretical guarantees on sparsification error and computational complexity, and demonstrate through extensive experiments on real and synthetic graphs that PSNE achieves superior efficiency and competitive or superior embedding quality compared with a suite of baselines. This approach enables scalable, structure-aware network embedding suitable for very large graphs while preserving nonlinear representation power. Overall, PSNE offers a principled, scalable alternative to Local Push-based methods with practical impact on graph learning tasks.

Abstract

Network embedding has numerous practical applications and has received extensive attention in graph learning. It aims to map vertices into a low-dimensional, continuous, dense vector space while preserving the underlying structural properties of the graph. Many network embedding methods have been proposed, among which factorization of the Personalized PageRank (PPR for short) matrix has recently been well supported both empirically and theoretically. However, several fundamental issues remain unaddressed. (1) Existing methods invoke the seminal Local Push subroutine to approximate a single row or column of the PPR matrix. Thus, they have to execute $n$ Local Push subroutines ($n$ is the number of nodes) to obtain a provable PPR matrix, resulting in prohibitively high computational costs for large $n$. (2) The PPR matrix has limited power in capturing the structural similarity between vertices, leading to performance degradation. To overcome these issues, we propose PSNE, an efficient spectral sParsification method for Scaling Network Embedding, which can quickly obtain embedding vectors that retain strong structural similarities. Specifically, PSNE first designs a matrix polynomial sparser to accelerate the calculation of the PPR matrix, with a theoretical guarantee in terms of the Frobenius norm. Subsequently, PSNE proposes a simple but effective multiple-perspective strategy to further enhance the representation power of the obtained approximate PPR matrix. Finally, PSNE applies a randomized singular value decomposition algorithm to the sparse and multiple-perspective PPR matrix to get the target embedding vectors. Experimental evaluation on real-world and synthetic datasets shows that our solutions are indeed more efficient, effective, and scalable compared with ten competitors.
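To make issue (1) concrete, the following is a minimal sketch of a forward Local Push for one source node, assuming the standard push rule (keep an `alpha` fraction of a node's residual, spread the rest evenly over its neighbours). The function names, the adjacency-dict input, and the toy graph are illustrative assumptions and are not taken from the paper. The outer loop in `ppr_matrix_by_push` is the one-push-per-node cost that the abstract describes.

```python
from collections import deque


def forward_push(adj, source, alpha=0.15, eps=1e-4):
    """Approximate the PPR row of `source` with a forward push.

    adj: dict mapping node -> list of neighbours (undirected, no isolated nodes).
    Returns (estimate, residual). Probability mass is conserved:
    sum(estimate) + sum(residual) stays equal to 1.
    """
    estimate = {u: 0.0 for u in adj}
    residual = {u: 0.0 for u in adj}
    residual[source] = 1.0
    queue = deque([source])
    queued = {source}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        r_u = residual[u]
        deg = len(adj[u])
        if r_u < eps * deg:
            continue  # residual too small to be worth pushing
        estimate[u] += alpha * r_u          # keep an alpha fraction at u
        residual[u] = 0.0
        share = (1.0 - alpha) * r_u / deg   # spread the rest over neighbours
        for v in adj[u]:
            residual[v] += share
            if residual[v] >= eps * len(adj[v]) and v not in queued:
                queue.append(v)
                queued.add(v)
    return estimate, residual


def ppr_matrix_by_push(adj, alpha=0.15, eps=1e-4):
    """Build every row by running one push per source node (n pushes in total)."""
    return {s: forward_push(adj, s, alpha, eps)[0] for s in adj}


if __name__ == "__main__":
    # Hypothetical 4-node toy graph, used only for illustration.
    toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    rows = ppr_matrix_by_push(toy)
    for s, row in rows.items():
        print(s, {v: round(p, 3) for v, p in row.items()})
```

Each call touches only the neighbourhood of its source, which is why a single row is cheap; repeating it for all $n$ sources is what becomes expensive on large graphs.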

Paper Structure (22 sections, 7 theorems, 11 equations, 6 figures, 5 tables, 2 algorithms)

This paper contains 22 sections, 7 theorems, 11 equations, 6 figures, 5 tables, 2 algorithms.

Key Result

theorem 1

[Sparsifiers of Random-Walk Matrix Polynomials] For any undirected graph $G$ and $0 < \epsilon \leq 0.5$, there exists a matrix $\widetilde{L}$ with $O(n \log n/\epsilon^2)$ non-zero entries such that for any $x\in\mathbb{R}^{n}$, we have

Figures (6)

  • Figure 1: $v_1$ is the source node and $\alpha=0.15$ is the decay parameter in PPR. The PPR values $\Pi(v_1, v_3)$ and $\Pi(v_1, v_7)$ are 0.054 and 0.140, respectively. The proposed multiple-perspective PPR values $M(v_1, v_3)$ and $M(v_1, v_7)$ are 0.142 and 0.095, respectively (see Table \ref{tab:structure-aware} for details).
  • Figure 2: The design of the PSNE framework. In Step 1 (i.e., Section \ref{subsec:sppr}), PSNE first constructs the sparsifier $\widetilde{L}$ by sampling $N$ paths and assigning weights to the newly sampled edges. Then, Equations \ref{fomula4}-\ref{eq:9} and $\widetilde{L}$ are applied to directly approximate the PPR matrix, which avoids computing each row or column of the PPR matrix separately as the traditional Local Push method does. In Step 2 (i.e., Section \ref{subsection:mp}), PSNE devises a multiple-perspective strategy to further enhance the representation power of the coarse-grained and sparse PPR matrix obtained in Step 1. In Step 3 (i.e., Section \ref{subsec:algorithm}), a randomized singular value decomposition (RSVD) algorithm is executed on the sparse and multiple-perspective PPR proximity matrix to obtain the target embedding matrix.
  • Figure 3: Scalability testing on synthetic graphs.
  • Figure 4: The performance of different network embedding methods.
  • Figure 5: Sampling quantity analysis (orange line and blue line represent NetSMF and PSNE, respectively).
  • ...and 1 more figure
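
For reference, the quantity that Figures 1 and 2 refer to can be computed exactly on a small graph. The sketch below assumes the standard series definition $\Pi = \alpha \sum_{r \ge 0} (1-\alpha)^r P^r$ with $P = D^{-1}A$ and truncates it after `num_terms` powers. It is a dense, brute-force illustration, not PSNE's sparsifier; the graph of Figure 1 is not reproduced here, so the adjacency matrix is a hypothetical toy example and the helper names are assumptions.

```python
def transition_matrix(adj_matrix):
    """Row-normalise A into P = D^{-1} A (assumes every node has degree > 0)."""
    return [[a / sum(row) for a in row] for row in adj_matrix]


def matmul(X, Y):
    """Plain dense matrix product of two lists of lists."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[t] * Y[t][j] for t in range(inner)) for j in range(cols)]
            for row in X]


def ppr_matrix(adj_matrix, alpha=0.15, num_terms=100):
    """Truncated series: Pi ~= alpha * sum_{r=0}^{num_terms} (1 - alpha)^r P^r."""
    n = len(adj_matrix)
    P = transition_matrix(adj_matrix)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    Pi = [[0.0] * n for _ in range(n)]
    coeff = alpha  # coefficient of P^0
    for _ in range(num_terms + 1):
        for i in range(n):
            for j in range(n):
                Pi[i][j] += coeff * power[i][j]
        power = matmul(power, P)   # advance to the next power of P
        coeff *= 1.0 - alpha       # advance to the next coefficient
    return Pi


if __name__ == "__main__":
    # Hypothetical undirected toy graph (not the graph shown in Figure 1).
    A = [[0, 1, 1, 0],
         [1, 0, 1, 0],
         [1, 1, 0, 1],
         [0, 0, 1, 0]]
    for row in ppr_matrix(A):
        print([round(x, 3) for x in row])
```

Every power of $P$ here costs a dense matrix product, which is only feasible for tiny graphs; this is the cost that sampling a sparsifier $\widetilde{L}$ is meant to avoid.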

Theorems & Definitions (9)

  • definition 1: Random-Walk Matrix Polynomials
  • theorem 1
  • definition 2: Anonymous Walk
  • theorem 2
  • theorem 3
  • theorem 4
  • theorem 5
  • lemma 1
  • lemma 2
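
Definition 2 is only named in the list above. As a rough illustration, and assuming the paper follows the usual definition from the anonymous-walk literature (each node in a walk is replaced by the order in which it first appears), a walk can be anonymized as in the sketch below. The function name and the 1-based indexing are assumptions.

```python
def anonymous_walk(walk):
    """Map a walk to its anonymous pattern.

    Each node is replaced by the 1-based index of its first appearance,
    so the pattern records the revisit structure but not node identities.
    """
    first_seen = {}
    pattern = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen) + 1
        pattern.append(first_seen[node])
    return tuple(pattern)


if __name__ == "__main__":
    # Two walks over different nodes that share one structural pattern.
    print(anonymous_walk(["v1", "v2", "v3", "v2"]))
    print(anonymous_walk(["v7", "v5", "v9", "v5"]))
```

Because node identities are discarded, two vertices in distant parts of a graph can produce the same patterns, which is the kind of structural similarity that plain PPR proximity tends to miss.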