Predicting Properties of Nodes via Community-Aware Features

Bogumił Kamiński, Paweł Prałat, François Théberge, Sebastian Zając

TL;DR

This work shows that incorporating community structure into node features yields information not captured by classical features or standard embeddings, improving node classification in many settings. The authors define a family of community-aware features (anomaly-based features, distribution-based features, and a novel $\beta^*$ measure), ground them in null models and modularity theory, and demonstrate that their predictive power is non-redundant and that they scale in practice. Across synthetic ABCD+o graphs and diverse real-world networks, these features often outperform traditional metrics while remaining interpretable and efficient to compute. Embeddings may still dominate in some empirical cases, but the community-aware set is a valuable alternative, especially for large graphs where embeddings are costly. Overall, the paper argues for including community-aware features in predictive pipelines to exploit network structure while maintaining interpretability.

Abstract

This paper shows how information about a network's community structure can be used to define node features with high predictive power for classification tasks. To do so, we define a family of community-aware node features and investigate their properties. These features are designed so that they can be computed efficiently even for large graphs. We show that community-aware node features contain information that cannot be completely recovered by classical node features or by node embeddings (both classical and structural), and that they bring value in node classification tasks. We verify this on various classification tasks on synthetic and real-life networks.
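
As a rough illustration of what a community-aware node feature can look like, the sketch below computes, for each node, the fraction of its neighbours that belong to the same community. This is an assumed, simplified feature chosen only for illustration; it is not the paper's $\beta^*$ measure, nor one of its anomaly-based or distribution-based features, whose definitions are not given here. The function name `same_community_fraction`, the toy graph, and the community labels are all hypothetical, and the partition is supplied by hand rather than found by a community detection algorithm.

```python
from collections import defaultdict


def same_community_fraction(edges, community_of):
    """Return, for each node, the fraction of its neighbours sharing its community.

    Illustrative feature only (an assumption of this sketch), not a definition
    taken from the paper.
    """
    # Build an undirected adjacency structure from the edge list.
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # For each node, count neighbours carrying the same community label.
    return {
        node: sum(community_of[n] == community_of[node] for n in nbrs) / len(nbrs)
        for node, nbrs in neighbours.items()
    }


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
# Hypothetical, hand-assigned partition into communities "A" and "B".
community_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

features = same_community_fraction(edges, community_of)
for node in sorted(features):
    print(node, round(features[node], 3))
```

In this toy graph, nodes 2 and 3 sit on the boundary between the two communities and receive a lower value than the interior nodes. A classical feature such as degree would not separate them in the same way, since it ignores community labels. In a real pipeline the partition would typically come from a community detection step, and such a feature would be one column among many in the feature matrix.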

Paper Structure

This paper contains 19 sections, 15 equations, 3 figures, 7 tables.

Figures (3)

  • Figure 1: Communities (red and green colours) in the Karate graph. The shades of nodes correspond to their values of $\beta^*(v)$ (darker colours indicate lower values).
  • Figure 2: Results of one-way predictive power assessment of considered node features for ABCD+o graphs.
  • Figure 3: Results of one-way predictive power assessment of considered node features for empirical graphs.