Community Concealment from Unsupervised Graph Learning-Based Clustering
Dalyapraz Manatova, Pablo Moriano, L. Jean Camp
TL;DR
This work studies group-level privacy risks arising from GNN-based unsupervised community detection and proposes a defense against targeted community inference. It identifies two key factors that govern hidability: the inter/intra-edge ratio of the target community and the feature-space proximity to neighboring communities, and uses these insights to motivate a feature-guided defense. The authors introduce FCom-DICE, which extends the structural perturbations of DICE by adding feature-aware edge attachments and node-feature adjustments to disrupt GNN message passing. Across synthetic featurized LFR graphs and real networks, FCom-DICE yields median improvements of roughly 20-45% over DICE under identical budgets while preserving the overall community structure, underscoring the importance of jointly considering topology and attributes in privacy-aware graph learning.
Abstract
Graph neural networks (GNNs) are designed to use attributed graphs to learn representations. Such representations are beneficial in the unsupervised learning of clusters and community detection. Nonetheless, such inference may reveal sensitive groups, clustered systems, or collective behaviors, raising concerns regarding group-level privacy. Community attribution in social and critical infrastructure networks, for example, can expose coordinated asset groups, operational hierarchies, and system dependencies that could be used for profiling or intelligence gathering. We study a defensive setting in which a data publisher (defender) seeks to conceal a community of interest while making limited, utility-aware changes to the network. Our analysis indicates that community concealment is strongly influenced by two quantifiable factors: connectivity at the community boundary and feature similarity between the protected community and adjacent communities. Informed by these findings, we present a perturbation strategy that rewires a set of selected edges and modifies node features to reduce the distinctiveness leveraged by GNN message passing. The proposed method outperforms DICE in our experiments on synthetic benchmarks and real network graphs under identical perturbation budgets. Overall, it achieves median relative concealment improvements of approximately 20-45% across the evaluated settings. These findings demonstrate a mitigation strategy against GNN-based community learning and highlight group-level privacy risks intrinsic to graph learning.
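To make the two factors named above concrete, the following is a minimal sketch of how one might quantify them on a toy graph. It is not the authors' implementation: the function names, the use of community centroids, and the choice of cosine similarity are all assumptions made for illustration, and the paper may define boundary connectivity and feature proximity differently.

```python
import math

def inter_intra_ratio(edges, community):
    """Boundary-crossing edges divided by internal edges for one community.

    Assumed definition for illustration; returns inf if there are no internal edges.
    """
    members = set(community)
    intra = inter = 0
    for u, v in edges:
        u_in, v_in = u in members, v in members
        if u_in and v_in:
            intra += 1          # both endpoints inside the community
        elif u_in or v_in:
            inter += 1          # exactly one endpoint inside: boundary edge
    return inter / intra if intra else math.inf

def centroid(features, nodes):
    """Component-wise mean of the feature vectors of the given nodes."""
    nodes = list(nodes)
    dim = len(features[nodes[0]])
    return [sum(features[n][d] for n in nodes) / len(nodes) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def neighbor_feature_similarity(edges, features, labels, target):
    """Mean cosine similarity between the target community's centroid and the
    centroids of communities that share at least one edge with it.

    Hypothetical proxy for "feature-space proximity"; returns None if the
    target has no adjacent community.
    """
    members = {n for n, c in labels.items() if c == target}
    adjacent = set()
    for u, v in edges:
        if (u in members) != (v in members):
            outside = v if u in members else u
            adjacent.add(labels[outside])
    if not adjacent:
        return None
    target_centroid = centroid(features, members)
    sims = [
        cosine(target_centroid,
               centroid(features, [n for n, c in labels.items() if c == other]))
        for other in adjacent
    ]
    return sum(sims) / len(sims)

# Toy data: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
features = {0: [1, 0], 1: [1, 0], 2: [1, 0], 3: [0, 1], 4: [0, 1], 5: [0, 1]}

ratio = inter_intra_ratio(edges, [0, 1, 2])
similarity = neighbor_feature_similarity(edges, features, labels, "A")
```

In this toy graph, community A has one boundary edge against three internal edges and orthogonal features relative to its only neighbor. Under the abstract's framing, that combination of a low ratio and low similarity would correspond to a community that is comparatively hard to conceal.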
