Adaptive Node Feature Selection For Graph Neural Networks
Ali Azizpour, Madeline Navarro, Santiago Segarra
TL;DR
This work addresses interpretability and efficiency in graph neural networks by introducing a permutation-based node feature importance score (NPT) and an adaptive node feature selection (ANFS) algorithm that prunes uninformative features during training. The analysis theoretically links node features and graph structure to GCN performance. Empirically, NPT provides meaningful importance scores across graph datasets with varying homophily, while ANFS maintains or closely matches full-feature accuracy with fewer attributes. The approach is model- and task-agnostic, allows feature relevance to be monitored as training proceeds, and reduces dimensionality without sacrificing predictive power. Possible extensions include other graph tasks and more robust permutation schemes.
Abstract
We propose an adaptive node feature selection approach for graph neural networks (GNNs) that identifies and removes unnecessary features during training. The ability to measure how features contribute to model output is key for interpreting decisions, reducing dimensionality, and even improving performance by eliminating unhelpful variables. However, graph-structured data introduces complex dependencies that may not be amenable to classical feature importance metrics. Motivated by this challenge, we present a model- and task-agnostic method that determines relevant features during training based on changes in validation performance upon permuting feature values. We theoretically motivate our intervention-based approach by characterizing how GNN performance depends on the relationships between node data and graph structure. In addition to returning feature importance scores once training concludes, we track how relevance evolves as features are successively dropped. We can therefore monitor whether features are eliminated effectively, and the same technique can be used to evaluate other metrics. Our empirical results verify the flexibility of our approach across different graph architectures as well as its adaptability to more challenging graph learning settings.
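
The permutation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `permutation_importance`, the `predict` interface, the toy path graph, and the fixed weight vector `w` are all assumptions made for illustration, and accuracy stands in for whatever validation metric is used. Each feature column is shuffled across nodes, and the resulting drop in validation accuracy is taken as that feature's score.

```python
import numpy as np

def permutation_importance(predict, X, y, val_idx, n_repeats=5, seed=0):
    """Mean drop in validation accuracy when one feature column is shuffled.

    `predict` is a hypothetical callable mapping a node feature matrix to
    per-node label predictions; any trained GNN could play this role.
    """
    rng = np.random.default_rng(seed)
    base = np.mean(predict(X)[val_idx] == y[val_idx])
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(X[:, j])  # break the feature-label link
            acc = np.mean(predict(Xp)[val_idx] == y[val_idx])
            drops.append(base - acc)
        scores[j] = np.mean(drops)
    return scores

# Toy setup (assumed): a path graph on 40 nodes with two label blocks.
n = 40
y = np.array([0] * 20 + [1] * 20)
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A_hat = A + np.eye(n)
A_hat = A_hat / A_hat.sum(axis=1, keepdims=True)  # row-normalized with self-loops

# Feature 0 encodes the label; feature 1 is pure noise.
noise = np.random.default_rng(1).standard_normal(n)
X = np.column_stack([2.0 * y - 1.0, noise])

# Stand-in for a trained one-layer graph convolution that ignores feature 1.
w = np.array([1.0, 0.0])
predict = lambda feats: (A_hat @ feats @ w > 0).astype(int)

val_idx = np.arange(0, n, 2)
scores = permutation_importance(predict, X, y, val_idx)
print(scores)
```

In this toy setting the noise feature receives a score of exactly zero because the stand-in model assigns it no weight, while shuffling the informative feature lowers validation accuracy. An adaptive selection loop in the spirit of the abstract would recompute such scores during training and drop the lowest-scoring features; the schedule and threshold for doing so are not specified here.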
