Balanced Multi-Relational Graph Clustering
Zhixiang Shen, Haolan He, Zhao Kang
TL;DR
This work tackles view imbalance in multi-relational graphs by introducing Aggregation Class Distance (ACD), which quantifies cross-view structural disparities, and Balanced Multi-Relational Graph Clustering (BMGC), which learns representations guided by a dominant view mined without supervision. BMGC jointly performs scalable per-view encoding, unsupervised dominant view mining, and dual-signal representation learning to align views with the dominant one and enhance clustering via a dominant-assignment mechanism. Theoretical analysis links dominant-view mining to ACD and demonstrates its effectiveness, while extensive experiments on real-world, synthetic, and large-scale datasets show state-of-the-art clustering performance and robustness to noisy views. The approach offers a scalable, unsupervised solution for robust clustering in imbalanced multi-relational graphs, with practical impact for complex networks.
Abstract
Multi-relational graph clustering has demonstrated remarkable success in uncovering underlying patterns in complex networks. Representative methods align different views, motivated by advances in contrastive learning. Our empirical study, however, finds that view imbalance is pervasive in real-world graphs, which in principle contradicts the motivation for alignment. In this paper, we first propose a novel metric, the Aggregation Class Distance (ACD), to empirically quantify structural disparities among different graphs. To address the challenge of view imbalance, we propose Balanced Multi-Relational Graph Clustering (BMGC), comprising unsupervised dominant view mining and dual-signal guided representation learning. BMGC dynamically mines the dominant view throughout the training process, so that dominant view mining and representation learning jointly improve clustering performance. Theoretical analysis ensures the effectiveness of dominant view mining. Extensive experiments and in-depth analysis on real-world and synthetic datasets show that BMGC achieves state-of-the-art performance, underscoring its superiority in addressing the view imbalance inherent in multi-relational graphs. The source code and datasets are available at https://github.com/zxlearningdeep/BMGC.
