Attributed Graph Clustering in Collaborative Settings
Rui Zhang, Xiaoyang Hou, Zhihua Tian, Yan He, Enchao Gong, Jian Liu, Qingbiao Wu, Kui Ren
TL;DR
This work tackles unsupervised graph clustering when node attributes are partitioned across collaborators (the vertical setting) and data privacy must be preserved. It introduces kCAGC, a graph-filtering-based framework that reduces communication by intersecting the participants' local clusterings to form a small set of virtual nodes, which are then securely aggregated to produce globally coherent clusters. The authors prove a correctness guarantee under a restricted proximity condition and show that kCAGC achieves accuracy comparable to centralized methods on four public datasets while considerably reducing communication costs. Empirical results show favorable utility and practical efficiency in both LAN and WAN scenarios, and a security analysis shows low leakage under the honest-but-curious model. The approach offers a principled, privacy-preserving solution for collaborative graph clustering over vertically partitioned data, with scalable communication and robust performance.
Abstract
Graph clustering is an unsupervised machine learning method that partitions the nodes of a graph into groups. Despite significant progress in exploiting both attribute and structural information, graph clustering methods often face practical challenges related to data isolation. Moreover, the absence of collaborative methods for graph clustering limits their effectiveness. In this paper, we propose a collaborative graph clustering framework for attributed graphs that supports clustering over vertically partitioned data, where different participants hold distinct features of the same data. Our method leverages a novel technique that reduces the sample space, improving the efficiency of attributed graph clustering. Furthermore, we compare our method to its centralized counterpart under a proximity condition, demonstrating that the successful local results of each participant contribute to the overall success of the collaboration. We fully implement our approach and evaluate its utility and efficiency through experiments on four public datasets. The results demonstrate that our method achieves accuracy comparable to centralized attributed graph clustering methods. Our collaborative graph clustering framework provides an efficient and effective solution for graph clustering challenges related to data isolation.
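
The sample-space reduction can be illustrated with a minimal sketch. It assumes each participant has already clustered the shared nodes on its own feature block, and it groups nodes that receive the same label at every participant into one virtual node. The function names, the toy labels, and the use of plain averaging are assumptions for illustration only; they are not the authors' implementation, and the averaging here is done in the clear rather than through secure aggregation.

```python
import numpy as np

def intersect_local_clusterings(local_labels):
    """Group nodes that share the same cluster label at every participant."""
    # Shape (n_nodes, n_participants): row i is node i's tuple of local labels.
    label_tuples = np.stack(local_labels, axis=1)
    # Each distinct tuple is one intersection cell, i.e. one virtual node.
    cells, virtual_id = np.unique(label_tuples, axis=0, return_inverse=True)
    # reshape(-1) keeps the result 1-D across NumPy versions.
    return cells, virtual_id.reshape(-1)

def virtual_node_features(features, virtual_id, n_virtual):
    """Average the features of the nodes in each cell (hypothetical helper)."""
    sums = np.zeros((n_virtual, features.shape[1]))
    np.add.at(sums, virtual_id, features)  # accumulate rows per cell
    counts = np.bincount(virtual_id, minlength=n_virtual)
    return sums / counts[:, None]  # every cell holds at least one node

# Toy example: 6 nodes, 2 participants, 2 local clusters each (made-up labels).
labels_a = np.array([0, 0, 1, 1, 0, 1])
labels_b = np.array([0, 1, 1, 1, 0, 0])
cells, virtual_id = intersect_local_clusterings([labels_a, labels_b])

features = np.arange(12, dtype=float).reshape(6, 2)
virtual_feats = virtual_node_features(features, virtual_id, len(cells))
print(len(cells), virtual_id.tolist())
```

With P participants that each use k local clusters, the number of virtual nodes is at most k^P and never exceeds the number of original nodes, so any later step that works on virtual nodes instead of raw samples handles a smaller input whenever some cell contains more than one node.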
