How Powerful is Graph Filtering for Recommendation

Shaowen Peng, Xin Liu, Kazunari Sugiyama, Tsunenori Mine

TL;DR

This paper investigates how powerful graph filtering is for collaborative filtering (CF) recommendation and identifies two core limitations: a lack of generality across data densities and the limited expressive power of linear GCNs. It proposes Generalized Graph Normalization (G^2N), which adjusts the sharpness of the graph spectrum so that noise can be removed by filtering without training, and Individualized Graph Filtering (IGF), which adapts to the different confidence levels with which interactions reflect user preference and can generate arbitrary embeddings. Both lead to SGFCF, a simplified method that needs only the top-$K$ singular values for prediction. The authors show theoretically that linear GCNs cannot generate arbitrary embeddings, and report experiments on four datasets with different density settings in which SGFCF outperforms baselines with substantial speedups. The result is a density-robust approach to graph-based recommendation that avoids heavy supervised training in many settings.

Abstract

It has been shown that the effectiveness of graph convolutional networks (GCNs) for recommendation is attributed to spectral graph filtering. Most GCN-based methods consist of a graph filter, or of a graph filter followed by a low-rank mapping optimized by supervised training. However, we show two limitations that suppress the power of graph filtering: (1) Lack of generality. Because the noise distribution varies, graph filters fail to denoise sparse data, where noise is scattered across all frequencies, while supervised training performs worse on dense data, where noise is concentrated in the middle frequencies and can be removed by graph filters without training. (2) Lack of expressive power. We theoretically show that the linear GCN (LGCN), which is effective for collaborative filtering (CF), cannot generate arbitrary embeddings, implying that the optimal data representation might be unreachable. To tackle the first limitation, we show a close relation between the noise distribution and the sharpness of the spectrum: a sharper spectral distribution is more desirable because it makes data noise separable from important features without training. Based on this observation, we propose a generalized graph normalization (G^2N) that adjusts the sharpness of the spectral distribution, redistributing data noise so that it can be removed by graph filtering without training. For the second limitation, we propose an individualized graph filter (IGF) that adapts to the different confidence levels with which interactions reflect user preference, and we prove that it can generate arbitrary embeddings. By simplifying LGCN, we further propose a simplified graph filtering method for CF (SGFCF), which requires only the top-K singular values for recommendation. Finally, experimental results on four datasets with different density settings demonstrate the effectiveness and efficiency of our proposed methods.
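To make the idea of recommending from only the top-K singular values concrete, here is a minimal sketch of a generic low-pass spectral filter for CF. It is not the authors' SGFCF: it omits G^2N and IGF entirely. The symmetric degree normalization, the toy interaction matrix, and the function names `normalize_interactions` and `topk_filter_scores` are all assumptions chosen for illustration.

```python
import numpy as np

def normalize_interactions(R, eps=1e-12):
    """Symmetric degree normalization D_u^{-1/2} R D_i^{-1/2}.

    Assumption: this is a common default, not the paper's G^2N.
    """
    du = R.sum(axis=1, keepdims=True)   # user degrees, shape (n_users, 1)
    di = R.sum(axis=0, keepdims=True)   # item degrees, shape (1, n_items)
    return R / np.sqrt(np.maximum(du, eps)) / np.sqrt(np.maximum(di, eps))

def topk_filter_scores(R, k):
    """Project normalized interactions onto the top-k right singular vectors."""
    R_hat = normalize_interactions(R)
    _, S, Qt = np.linalg.svd(R_hat, full_matrices=False)
    Qk = Qt[:k].T                       # (n_items, k), largest singular values first
    return R_hat @ Qk @ Qk.T, S[:k]

# Hypothetical toy data: 4 users x 4 items, 1 = observed interaction.
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)

scores, top_sv = topk_filter_scores(R, k=2)

# Hide items each user has already interacted with, then pick the best remaining one.
masked = np.where(R > 0, -np.inf, scores)
recs = masked.argmax(axis=1)
print(recs)
```

Keeping only the leading singular vectors acts as a low-pass filter: components associated with small singular values, where noise is often assumed to sit, are discarded. No parameters are trained at any point.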

Paper Structure (44 sections, 6 theorems, 37 equations, 6 figures, 10 tables)

Key Result

Theorem 1

Given $\mathbf{P}$ and $\mathbf{Q}$ as the left and right singular vectors of $\mathbf{\hat{R}}$, we have the following relation: In particular, given an eigenvalue $\lambda_k>0$ with eigenvector $\mathbf{v}_k=[\mathbf{p}_k, \mathbf{q}_k]^T$, there always exists an eigenvalue $-\lambda_k$ with corresponding eigenvector $[\mathbf{p}_k, -\mathbf{q}_k]^T$.
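The sign symmetry stated in Theorem 1 can be checked numerically. The sketch below assumes the adjacency matrix has the usual bipartite block form $[[\mathbf{0}, \mathbf{\hat{R}}], [\mathbf{\hat{R}}^T, \mathbf{0}]]$, which this summary does not spell out, and uses a random matrix as a hypothetical stand-in for $\mathbf{\hat{R}}$. The function names are illustrative only.

```python
import numpy as np

def bipartite_adjacency(R):
    """Build the symmetric block matrix [[0, R], [R^T, 0]] (assumed form)."""
    m, n = R.shape
    return np.block([[np.zeros((m, m)), R],
                     [R.T, np.zeros((n, n))]])

def paired_eigenvectors(R, k):
    """Return (sigma_k, [p_k, q_k], [p_k, -q_k]) from the k-th singular triplet of R."""
    P, S, Qt = np.linalg.svd(R, full_matrices=False)
    p, q, sigma = P[:, k], Qt[k, :], S[k]
    v_plus = np.concatenate([p, q]) / np.sqrt(2.0)    # unit norm
    v_minus = np.concatenate([p, -q]) / np.sqrt(2.0)  # unit norm
    return sigma, v_plus, v_minus

rng = np.random.default_rng(0)
R = rng.random((5, 4))          # hypothetical stand-in for R_hat
A = bipartite_adjacency(R)

sigma, v_plus, v_minus = paired_eigenvectors(R, 0)

# A v_plus should equal +sigma v_plus, and A v_minus should equal -sigma v_minus.
residual_plus = np.linalg.norm(A @ v_plus - sigma * v_plus)
residual_minus = np.linalg.norm(A @ v_minus + sigma * v_minus)
print(residual_plus, residual_minus)
```

Both residuals should be at the level of floating-point round-off. Under the assumed block form, this means each positive eigenvalue of the adjacency matrix equals a singular value of the stand-in matrix and is mirrored by a negative eigenvalue of the same magnitude.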

Figures (6)

  • Figure 1: The accuracy (Recall@20) of SGF and LGCN when only considering top-$K$% low frequencies.
  • Figure 2: (a) Spectral distribution of CiteULike with different density settings. (b) the eigenvalue ratio ($\lambda'_k/\lambda_k$) of $x=\{40,60,80,100\}$ ($\lambda'_k$) to $x=20$ ($\lambda_k$).
  • Figure 3: Eigenvalue ratio of $\mathbf{\tilde{A}}$ to $\mathbf{\hat{A}}$ with varying $\alpha$ and $\epsilon$ on CiteULike.
  • Figure 4: How performance changes with varying $\alpha$ and $\epsilon$.
  • Figure 5: How performance changes with varying $\gamma$.
  • ...and 1 more figure

Theorems & Definitions (14)

  • Definition 1
  • Theorem 1
  • Corollary 1
  • Theorem 2
  • Theorem 3
  • Definition 2
  • Theorem 4
  • Definition 3
  • Theorem 5
  • proof
  • ...and 4 more