Is Graph Convolution Always Beneficial For Every Feature?

Yilun Zheng, Xiang Li, Sitao Luan, Xiaojiang Peng, Lihui Chen

TL;DR

Compared to the original GNNs, Graph Feature Selection (GFS) significantly improves the extraction of useful topological information from each feature at comparable computational cost. Its improvements are robust to hyperparameter tuning, highlighting its potential as a universal method for enhancing various GNN architectures.

Abstract

Graph Neural Networks (GNNs) have demonstrated strong capabilities in processing structured data. While traditional GNNs typically treat each feature dimension equally during graph convolution, we raise an important question: Is the graph convolution operation equally beneficial for each feature? If not, applying convolution to certain feature dimensions may be harmful, yielding performance even worse than that of convolution-free models. To assess the impact of graph convolution on features, prior studies proposed metrics based on feature homophily, which measure how consistent a feature is with the graph topology. However, these metrics align poorly with GNN performance and have not been effectively employed to guide feature selection in GNNs. To address these limitations, we introduce a novel metric, Topological Feature Informativeness (TFI), to distinguish between GNN-favored and GNN-disfavored features; its effectiveness is validated through both theoretical analysis and empirical observations. Based on TFI, we propose a simple yet effective Graph Feature Selection (GFS) method, which processes GNN-favored and GNN-disfavored features separately, using GNNs for the former and non-GNN models for the latter. Compared to the original GNNs, GFS significantly improves the extraction of useful topological information from each feature at comparable computational cost. Extensive experiments show that after applying GFS to 8 baseline and state-of-the-art (SOTA) GNN architectures across 10 datasets, 83.75% of the GFS-augmented cases show significant performance boosts. Furthermore, our proposed TFI metric outperforms other feature selection methods. These results validate the effectiveness of both GFS and TFI. Additionally, we demonstrate that GFS's improvements are robust to hyperparameter tuning, highlighting its potential as a universal method for enhancing various GNN architectures.
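
The feature-level split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how TFI is computed, so the per-feature `scores` below are assumed to be given, and the function names, the top-`r` selection rule, and the mean-neighbor aggregation standing in for a graph convolution are all hypothetical choices made for illustration.

```python
import numpy as np

def split_features_by_score(scores, r):
    """Split feature indices into (favored, disfavored).

    Assumption: the top round(r * d) features by score are treated as
    GNN-favored. The scores stand in for a per-feature metric such as
    TFI, whose computation is not shown here.
    """
    scores = np.asarray(scores, dtype=float)
    k = int(round(r * scores.shape[0]))
    order = np.argsort(-scores, kind="stable")  # highest score first
    return np.sort(order[:k]), np.sort(order[k:])

def mean_aggregate(adj, x):
    """One step of mean aggregation over neighbors plus self-loops.

    A simple stand-in for a graph convolution, chosen for illustration.
    """
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return (a_hat / deg) @ x

def gfs_inputs(adj, x, scores, r):
    """Route favored features through aggregation; pass the rest through raw."""
    favored, disfavored = split_features_by_score(scores, r)
    x_gnn = mean_aggregate(adj, x[:, favored])  # input to the GNN branch
    x_mlp = x[:, disfavored]                    # input to the non-GNN branch
    return x_gnn, x_mlp

# Toy example: a 3-node path graph with 4 features per node.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.arange(12, dtype=float).reshape(3, 4)
scores = [0.1, 0.9, 0.5, 0.3]  # hypothetical per-feature scores
x_gnn, x_mlp = gfs_inputs(adj, x, scores, r=0.5)
```

In this sketch the two outputs would each feed a separate model whose representations are combined downstream; how the paper combines the branches is not stated in the abstract.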

Paper Structure

This paper contains 35 sections, 19 equations, 13 figures, 14 tables.

Figures (13)

  • Figure 1: Improvements in GNN performance at node level and feature level. Different colors denote node labels, while the direction and magnitude of arrows denote node features.
  • Figure 2: The performance gap between GCN and MLP with the increase of TFI.
  • Figure 3: Framework of GFS with TFI.
  • Figure 4: The performance of GCN+GFS as the ratio $r$ of GNN-favored features selected by TFI increases. The point with the best performance is marked with $\bigstar$.
  • Figure 5: Response of GCN+GFS, GCN, and MLP to $5$ hyperparameters on Computers and Amazon-Ratings.
  • ...and 8 more figures