On the Power of Graph Neural Networks and Feature Augmentation Strategies to Classify Social Networks
Walid Guettala, László Gulyás
TL;DR
This work investigates graph classification on synthetic social networks by evaluating four GNN architectures (GCN with Hierarchical and Global pooling, GIN, GATv2) across five node-feature augmentation strategies (Ones, Noise, Degree, Norm Degree, Identity). All models operate on featureless graphs augmented with these artificial features and are trained with a shared classifier head, across a grid of hidden dimensions $H$, to distinguish eight network families generated by classic Network Science models. The study finds that high-capacity architectures (GIN, GATv2) generally perform well across augmentations, while informative features like Identity and Degree provide the strongest gains and can compensate for lower model complexity; Hierarchical pooling underperforms relative to other configurations. The results highlight a balance between architectural power and feature informativeness, with implications for graph classification on real networks after training on diverse synthetic benchmarks.
Abstract
This paper studies four Graph Neural Network (GNN) architectures for a graph classification task on a synthetic dataset created using classic generative models of Network Science. Since the synthetic networks do not contain node or edge features, five different augmentation strategies (artificial feature types) are applied to nodes. All combinations of the 4 GNNs (GCN with Hierarchical and Global aggregation, GIN and GATv2) and the 5 feature types (constant 1, noise, degree, normalized degree and ID, a vector of the number of cycles of various lengths) are studied, and their performances are compared as a function of the hidden dimension of the artificial neural networks used in the GNNs. The generalisation ability of these models is also analysed using a second synthetic network dataset (containing networks of different sizes). Our results point towards the balanced importance of the computational power of the GNN architecture and the information level provided by the artificial features. GNN architectures with higher computational power, like GIN and GATv2, perform well for most augmentation strategies. On the other hand, artificial features with higher information content, like ID or degree, not only consistently outperform other augmentation strategies, but can also help GNN architectures with lower computational power to achieve good performance.
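To make the five feature types concrete, the following is a minimal sketch of how they could be computed for a featureless graph given as a dense adjacency matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `augment_features`, the parameter `k_max`, uniform noise in [0, 1), normalisation by the maximum degree, and the use of closed-walk counts (the diagonal of powers of the adjacency matrix) as a stand-in for cycle counts are all assumptions made here.

```python
import random


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def augment_features(adj, kind, k_max=4, seed=None):
    """Return one feature vector per node for a featureless graph.

    adj   : square 0/1 adjacency matrix (list of lists), undirected graph
    kind  : "ones", "noise", "degree", "norm_degree" or "identity"
    k_max : largest walk length used by "identity" (hypothetical parameter)
    """
    n = len(adj)
    degree = [sum(row) for row in adj]

    if kind == "ones":
        # Constant feature: every node looks the same to the GNN.
        return [[1.0] for _ in range(n)]
    if kind == "noise":
        # Assumption: uniform noise in [0, 1); the distribution is a guess.
        rng = random.Random(seed)
        return [[rng.random()] for _ in range(n)]
    if kind == "degree":
        return [[float(d)] for d in degree]
    if kind == "norm_degree":
        # Assumption: normalise by the maximum degree in the graph.
        max_deg = max(degree) if n else 0
        return [[d / max_deg if max_deg else 0.0] for d in degree]
    if kind == "identity":
        # Assumption: approximate cycle counts by closed-walk counts,
        # i.e. the diagonal entries of A^2, A^3, ..., A^k_max.
        feats = [[] for _ in range(n)]
        power = adj
        for _ in range(2, k_max + 1):
            power = matmul(power, adj)
            for i in range(n):
                feats[i].append(float(power[i][i]))
        return feats
    raise ValueError(f"unknown feature type: {kind}")


# Usage: a triangle graph, where every node has degree 2.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
print(augment_features(triangle, "degree"))
print(augment_features(triangle, "identity"))
```

The ordering of the sketch mirrors the information content discussed above: "ones" and "noise" carry no structural information, "degree" and "norm_degree" expose one-hop structure, and "identity" summarises longer-range structure around each node.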
