
Cross-Space Adaptive Filter: Integrating Graph Topology and Node Attributes for Alleviating the Over-smoothing Problem

Chen Huang, Haoyang Li, Yifan Zhang, Wenqiang Lei, Jiancheng Lv

TL;DR

The paper tackles the over-smoothing problem in deep Graph Convolutional Networks by proposing the Cross-Space Filter (CSF), which integrates topology-derived low-pass filtering with attribute-derived high-pass filtering. It does so by deriving an attribute-based high-pass filter via semi-supervised kernel ridge regression, recasting the topology-based low-pass filter as a Mercer kernel, and then unifying the two through a simple multiple-kernel learning formulation. The resulting cross-space kernel, $\mathbb{K}$, is used in a propagation rule that concatenates raw node attributes, enabling information fusion across the two spaces. Empirical results on nine datasets show that CSF consistently improves performance, particularly on disassortative graphs, and remains robust as network depth increases. The work provides a new perspective on the role of node attributes and kernels in alleviating over-smoothing and offers a practical framework for cross-space spectral filter design.
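The reason both filters are cast as kernels before being merged can be illustrated with a minimal sketch: a non-negative weighted sum of two positive semi-definite (PSD) matrices is again PSD, so the combined object is still a valid kernel. The snippet below assumes a plain convex combination with a single weight `mu`; the paper's actual multiple-kernel learning formulation may differ, and `is_psd`, `combine_kernels`, `mu`, and the random stand-in matrices are hypothetical names and data used only for illustration.

```python
import numpy as np

def is_psd(K, tol=1e-8):
    """Return True if K is symmetric and all eigenvalues are >= -tol."""
    K = np.asarray(K, dtype=float)
    if not np.allclose(K, K.T, atol=tol):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

def combine_kernels(K_topo, K_attr, mu):
    """Convex combination mu * K_topo + (1 - mu) * K_attr.

    Assumption: this is a generic multiple-kernel combination,
    not necessarily the rule used by CSF.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return mu * K_topo + (1.0 - mu) * K_attr

# Stand-in PSD matrices (Gram matrices of random features),
# NOT the paper's topology or attribute filters.
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 3))
C = rng.normal(size=(5, 4))
K_topo = B @ B.T
K_attr = C @ C.T

K_mix = combine_kernels(K_topo, K_attr, mu=0.3)
print("combined kernel is PSD:", is_psd(K_mix))
```

The same check fails for a matrix with a negative eigenvalue (for example `-np.eye(3)`), which is why conditions guaranteeing kernel validity matter before the two filters are combined.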

Abstract

The vanilla Graph Convolutional Network (GCN) uses a low-pass filter to extract low-frequency signals from graph topology, which may lead to the over-smoothing problem when GCN goes deep. To this end, various methods have been proposed to create an adaptive filter by incorporating an extra filter (e.g., a high-pass filter) extracted from the graph topology. However, these methods heavily rely on topological information and ignore the node attribute space, which severely sacrifices the expressive power of the deep GCNs, especially when dealing with disassortative graphs. In this paper, we propose a cross-space adaptive filter, called CSF, to produce the adaptive-frequency information extracted from both the topology and attribute spaces. Specifically, we first derive a tailored attribute-based high-pass filter that can be interpreted theoretically as a minimizer for semi-supervised kernel ridge regression. Then, we cast the topology-based low-pass filter as a Mercer's kernel within the context of GCNs. This serves as a foundation for combining it with the attribute-based filter to capture the adaptive-frequency information. Finally, we derive the cross-space filter via an effective multiple-kernel learning strategy, which unifies the attribute-based high-pass filter and the topology-based low-pass filter. This helps to address the over-smoothing problem while maintaining effectiveness. Extensive experiments demonstrate that CSF not only successfully alleviates the over-smoothing problem but also promotes the effectiveness of the node classification task.
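The over-smoothing effect of a purely low-pass topology filter can be reproduced in a few lines. The sketch below is a simplification under stated assumptions: it applies only the standard GCN propagation matrix $D^{-1/2}(A+I)D^{-1/2}$ repeatedly, with no learned weights or nonlinearities, on a hypothetical 6-node ring graph with made-up features. It is not the CSF model; it only shows why stacking many low-pass steps makes node representations indistinguishable.

```python
import numpy as np

def gcn_low_pass(A):
    """Symmetrically normalized adjacency with self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def propagate(A_hat, X, num_layers):
    """Apply the filter num_layers times (weights and activations omitted)."""
    for _ in range(num_layers):
        X = A_hat @ X
    return X

# Hypothetical toy graph: an undirected ring of 6 nodes.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

# Made-up node features, one row per node.
X = np.arange(n * 3, dtype=float).reshape(n, 3)

A_hat = gcn_low_pass(A)
X_shallow = propagate(A_hat, X, 2)
X_deep = propagate(A_hat, X, 50)

# Largest per-feature standard deviation across nodes.
spread_shallow = X_shallow.std(axis=0).max()
spread_deep = X_deep.std(axis=0).max()
print("spread after 2 layers: ", spread_shallow)
print("spread after 50 layers:", spread_deep)
```

On this regular graph every node ends up with (numerically) the same feature vector after many steps, so the spread across nodes collapses toward zero. On non-regular graphs the rows instead become proportional to the square root of the node degree, which is equally uninformative for classification.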

Paper Structure

This paper contains 17 sections, 3 theorems, 6 equations, 7 figures, 6 tables, 1 algorithm.

Key Result

Proposition 1

$\hat{K}= I - \Gamma(K, a_3)$ is a valid kernel if and only if $a_3 > 0$. Also, $K_{attr}$ is a valid kernel if and only if $a_3 > 0$ and $a_2 > 0$.
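
For context, "valid kernel" is presumably meant in the usual Mercer sense. On a finite node set this is the textbook condition below, stated for a generic symmetric matrix $M \in \mathbb{R}^{n \times n}$ (a symbol introduced here for illustration); it is not a restatement of the paper's proof, and $\Gamma$, $a_2$, and $a_3$ are defined in the paper itself.

$$
M \text{ is a valid kernel} \iff M = M^{\top} \ \text{and} \ c^{\top} M c \ge 0 \ \ \forall\, c \in \mathbb{R}^{n} \iff M = M^{\top} \ \text{and} \ \lambda_{\min}(M) \ge 0 .
$$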

Figures (7)

  • Figure 1: Overview of CSF. We leverage both the graph topology and node attribute spaces to produce a cross-space adaptive filter for alleviating the over-smoothing problem and improving the effectiveness of deep GCNs.
  • Figure 2: Illustration of filter function $g(\lambda_i)$ of $K_{attr}$. This demonstrates that $K_{attr}$ corresponds to a high-pass filter.
  • Figure 3: Illustration of magnitude spectrum and frequency response. This demonstrates that $\mathbb{K}$ corresponds to an adaptive filter that combines both low-pass and high-pass filters.
  • Figure 4: Over-smoothing evaluation of adaptive filter-based methods. We vary the number of layers of each method (the X-axis). The Y-axis represents the classification accuracy (%). This indicates that CSF outperforms the other methods in both its robustness to the over-smoothing problem and its effectiveness on downstream tasks.
  • Figure 5: Ablation studies on the node attribute space. We report the performance averaged across various numbers of layers.
  • ...and 2 more figures

Theorems & Definitions (5)

  • Proposition 1
  • Proposition 2
  • Remark 1
  • Corollary 1
  • Remark 2