Predicting Properties of Nodes via Community-Aware Features
Bogumił Kamiński, Paweł Prałat, François Théberge, Sebastian Zając
TL;DR
This work shows that incorporating community structure into node features yields information not captured by classical features or standard embeddings, which improves node classification in many settings. The authors define a family of community-aware features (anomaly-based, distribution-based, and a novel β* measure) and ground them in null models and modularity theory. They demonstrate that these features carry non-redundant predictive power and scale to large graphs. On synthetic ABCD+o graphs and diverse real-world networks, the features often outperform traditional metrics while remaining interpretable and efficient to compute. Embeddings may still dominate in some empirical cases, but the community-aware set is a valuable alternative, especially for large graphs where embeddings are costly. Overall, the paper argues for including community-aware features in predictive pipelines to exploit network structure while keeping models interpretable.
Abstract
This paper shows how information about a network's community structure can be used to define node features with high predictive power for classification tasks. To do so, we define a family of community-aware node features and investigate their properties. These features are designed so that they can be computed efficiently even for large graphs. We show that community-aware node features contain information that cannot be completely recovered by classical node features or by node embeddings (both classical and structural), and that they bring value in node classification tasks. We verify this for various classification tasks on synthetic and real-life networks.
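To make the notion of a community-aware node feature concrete, the following minimal sketch computes, for each node, the fraction of its neighbours that belong to the same community. This is an illustrative feature only and is not claimed to be one of the features defined in the paper; the function name `own_community_fraction`, the toy graph, and the community labels are all assumptions made for the example, and the partition is taken as given rather than detected.

```python
def own_community_fraction(adj, community):
    """Return, per node, the fraction of neighbours sharing its community.

    adj:       dict mapping node -> list of neighbouring nodes (undirected graph)
    community: dict mapping node -> community label (assumed to be given)
    """
    feature = {}
    for node, neighbours in adj.items():
        if not neighbours:
            # Convention chosen for this sketch: isolated nodes get 0.0.
            feature[node] = 0.0
            continue
        same = sum(1 for v in neighbours if community[v] == community[node])
        feature[node] = same / len(neighbours)
    return feature


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

features = own_community_fraction(adj, community)
```

In this toy graph, nodes 2 and 3 are the only ones with a neighbour outside their own community, so they receive a lower value than the other nodes. A feature of this kind depends on the partition as well as on the node's local neighbourhood, which is the sense in which it can carry information that purely degree-based features do not.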
