Revisiting Dynamic Graph Clustering via Matrix Factorization

Dongyuan Li, Satoshi Kosugi, Ying Zhang, Manabu Okumura, Feng Xia, Renhe Jiang

TL;DR

The paper introduces bi-clustering regularization, which jointly optimizes graph embedding and clustering to filter noisy features out of the graph embeddings, and temporal separated matrix factorization, which speeds up computation.

Abstract

Dynamic graph clustering aims to detect and track time-varying clusters in dynamic graphs, revealing the evolutionary mechanisms of complex real-world dynamic systems. Matrix factorization-based methods are promising approaches for this task; however, these methods often struggle with scalability and can be time-consuming when applied to large-scale dynamic graphs. Moreover, they tend to lack robustness and are vulnerable to real-world noisy data. To address these issues, we make three key contributions. First, to improve scalability, we propose temporal separated matrix factorization, where a single matrix is divided into multiple smaller matrices for independent factorization, resulting in faster computation. Second, to improve robustness, we introduce bi-clustering regularization, which jointly optimizes graph embedding and clustering, thereby filtering out noisy features from the graph embeddings. Third, to further enhance effectiveness and efficiency, we propose selective embedding updating, where we update only the embeddings of dynamic nodes while the embeddings of static nodes are fixed among different timestamps. Experimental results on six synthetic and five real-world benchmarks demonstrate the scalability, robustness and effectiveness of our proposed method. Source code is available at https://github.com/Clearloveyuan/DyG-MF.
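The first contribution, dividing a single matrix into smaller matrices that are factorized independently, can be illustrated with a minimal sketch. This is not the DyG-MF implementation: the function names (`split_nodes`, `factorize_block`, `separated_factorization`, `reconstruct`) are hypothetical, truncated SVD is used as a stand-in for the paper's factorization objective, and the landmark selection, bi-clustering regularization, and selective embedding updating steps are omitted. It only shows that row blocks of one snapshot can be factorized separately and stitched back together.

```python
import numpy as np


def split_nodes(n_nodes, n_groups, rng):
    """Randomly partition node indices into n_groups roughly equal groups."""
    perm = rng.permutation(n_nodes)
    return np.array_split(perm, n_groups)


def factorize_block(block, rank):
    """Approximate block ~ C @ H with a truncated SVD (assumed stand-in)."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    k = min(rank, len(s))
    c = u[:, :k] * s[:k]  # scale left singular vectors by singular values
    h = vt[:k, :]
    return c, h


def separated_factorization(adj, n_groups, rank, seed=0):
    """Factorize each group's rows of one snapshot independently."""
    rng = np.random.default_rng(seed)
    factors = []
    for idx in split_nodes(adj.shape[0], n_groups, rng):
        block = adj[idx, :]  # rows of this node group against all nodes
        factors.append((idx, *factorize_block(block, rank)))
    return factors


def reconstruct(factors, shape):
    """Stitch the per-group low-rank approximations back together."""
    approx = np.zeros(shape)
    for idx, c, h in factors:
        approx[idx, :] = c @ h
    return approx


# Toy snapshot: two disconnected cliques of four nodes each (with self-loops).
adj = np.kron(np.eye(2), np.ones((4, 4)))
factors = separated_factorization(adj, n_groups=2, rank=2)
print(np.abs(reconstruct(factors, adj.shape) - adj).max())
```

Each group's block is small, so its factorization is cheap, and the loop body has no cross-group dependency, so the groups could be processed in parallel. Rank 2 is enough here because the toy graph has only two distinct row patterns.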

Paper Structure

This paper contains 20 sections, 3 theorems, 18 equations, 8 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

For all $i$ satisfying $1 \leq i \leq s$, assume that $C_{t}^{i}$ and $H_{t}^{i}$ in Eq. (\ref{equ8}) can be linearly represented by the basis and coefficient matrices of the landmarks, i.e., $C^{i}_{t}=P^{i}_{t}\Phi_{t}$ and $H^{i}_{t}=\Psi_{t} Q^{i}_{t}$. Then, jointly considering the matrix factorizati...

Figures (8)

  • Figure 1: Running time and performance on noisy data of our proposed method and three matrix factorization-based methods, i.e., DYNMOGA [Folina2013], jLMDC [9531337], and RDMA [DBLP:journals/www/RanjkeshMH24].
  • Figure 2: Overview of the proposed DyG-MF architecture. Our method (a) first selects temporal landmarks and (b) randomly divides nodes into several groups for (c) separated matrix factorization ((a)-(c) are introduced in Sec. \ref{main_TSMF}). In addition, we apply (d) bi-clustering regularization (Sec. \ref{bi-clustering-module}) and (e) selective embedding updating (Sec. \ref{topological_dynamics}) to dynamic graph clustering.
  • Figure 3: Performance on varying timestamps of selected best-performing baselines on four real-world datasets.
  • Figure 4: Modularity and Density on large dynamic graphs.
  • Figure 5: Scalability w.r.t. varying snapshots and nodes.
  • ...and 3 more figures

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3