Balanced Multi-Relational Graph Clustering

Zhixiang Shen, Haolan He, Zhao Kang

TL;DR

This work tackles view imbalance in multi-relational graphs. It introduces the Aggregation Class Distance (ACD) to quantify structural disparities across views, and Balanced Multi-Relational Graph Clustering (BMGC) to learn representations guided by a dominant view that is mined without supervision. BMGC combines scalable per-view encoding, unsupervised dominant view mining, and dual-signal guided representation learning to align the other views with the dominant one, and a dominant-assignment mechanism further improves clustering. Theoretical analysis links dominant view mining to ACD and supports its effectiveness, while experiments on real-world, synthetic, and large-scale datasets show state-of-the-art clustering performance and robustness to noisy views. The result is a scalable, unsupervised approach to clustering imbalanced multi-relational graphs.

Abstract

Multi-relational graph clustering has demonstrated remarkable success in uncovering underlying patterns in complex networks. Representative methods manage to align different views motivated by advances in contrastive learning. Our empirical study finds the pervasive presence of imbalance in real-world graphs, which is in principle contradictory to the motivation of alignment. In this paper, we first propose a novel metric, the Aggregation Class Distance, to empirically quantify structural disparities among different graphs. To address the challenge of view imbalance, we propose Balanced Multi-Relational Graph Clustering (BMGC), comprising unsupervised dominant view mining and dual signals guided representation learning. It dynamically mines the dominant view throughout the training process, synergistically improving clustering performance with representation learning. Theoretical analysis ensures the effectiveness of dominant view mining. Extensive experiments and in-depth analysis on real-world and synthetic datasets showcase that BMGC achieves state-of-the-art performance, underscoring its superiority in addressing the view imbalance inherent in multi-relational graphs. The source code and datasets are available at https://github.com/zxlearningdeep/BMGC.

Paper Structure

This paper contains 27 sections, 4 theorems, 19 equations, 4 figures, 6 tables, 1 algorithm.

Key Result

Lemma 1

Let $X^v$ be the aggregated feature matrix of the $v$-th view by applying SGC, with the number of hops $K$, to the expected adjacency matrix $\tilde{A}^v$ and the feature matrix $X$. Then, $X^v=F^v+c_1(\theta_1^v)^\top+c_2(\theta_2^v)^\top$, where $F^v=(\lambda_2^v)^KF+(1-(\lambda_2^v)^K)(\mathbf{1}
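
The per-view SGC aggregation that Lemma 1 refers to can be illustrated with a minimal sketch. It assumes the standard SGC propagation rule, $S^K X$ with $S = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$, applied to a dense, non-negative adjacency matrix per view. The function names `sgc_aggregate` and `aggregate_views` are hypothetical and are not taken from the BMGC code base, and the normalization used by the authors may differ.

```python
import numpy as np


def sgc_aggregate(adj: np.ndarray, features: np.ndarray, num_hops: int) -> np.ndarray:
    """K-hop SGC-style aggregation: S^K X, with S = D^{-1/2} (A + I) D^{-1/2}.

    Assumption: `adj` is a dense, non-negative (n, n) adjacency matrix.
    """
    n = adj.shape[0]
    adj_self = adj + np.eye(n)            # add self-loops
    deg = adj_self.sum(axis=1)            # every degree is >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization, written with broadcasting instead of diagonal matrices.
    norm_adj = adj_self * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    out = features.astype(float)
    for _ in range(num_hops):             # propagate K times, no nonlinearity
        out = norm_adj @ out
    return out


def aggregate_views(adjs, features, num_hops):
    """Apply the same K-hop aggregation to every view (one adjacency per relation)."""
    return [sgc_aggregate(a, features, num_hops) for a in adjs]
```

Because the propagation has no learnable parameters, the aggregated features for each view can be computed once before training, which is the usual reason SGC-style encoders scale to large graphs.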

Figures (4)

  • Figure 1: The node classification accuracy for each view of the test set, along with the corresponding ACD. The trends in accuracy and ACD are consistent.
  • Figure 2: Illustration of our proposed framework BMGC. Firstly, node representations for each view are obtained through scalable graph encoding. Then, unsupervised dominant view mining and dual signals guided representation learning synergistically facilitate model training. Finally, the dominant assignment further enhances clustering quality.
  • Figure 3: Case study on synthetic (left) and ACM (right) datasets. The specific meanings of the "Metric" in each figure can be found in the case study section.
  • Figure 4: The influence of $\alpha$ (left) and $K$ (right).

Theorems & Definitions (5)

  • Definition 1
  • Lemma 1
  • Theorem 1
  • Lemma 1
  • Theorem 1