Overlapping community detection in weighted networks

Huan Qing

TL;DR

This work addresses overlapping community detection in weighted networks by introducing the Weighted Degree-Corrected Mixed Membership (WDCMM) model, which extends the DCMM framework to arbitrary edge-weight distributions while preserving a low-rank, block-structured mean matrix $\Omega=\Theta\Pi P\Pi'\Theta$. It develops a distribution-free spectral estimator (ScD) that exploits an Ideal Cone structure to recover the mixed-membership matrix $\Pi$, with consistency bounds that hold under mild, distribution-agnostic conditions. To determine the number of communities $K$, the paper defines the overlapping weighted modularity $Q_{ovw}$ and proposes a modularity-driven approach (KScD) that applies to assortative, disassortative, and signed weighted networks. Simulations and real-data experiments show that ScD is competitive with existing methods across weight regimes, and that $Q_{ovw}$ provides reliable guidance for selecting $K$. The results provide practical tools for analyzing weighted networks with overlapping structure and motivate future extensions to Mixed-SCORE, directed networks, and faster, more scalable algorithms.

Abstract

Over the past decade, community detection in overlapping unweighted networks, where nodes can belong to multiple communities, has been one of the most popular topics in modern network science. However, community detection in overlapping weighted networks, where edge weights can be any real value, remains challenging. In this article, we propose a generative model, the weighted degree-corrected mixed membership (WDCMM) model, to describe such weighted networks. This model adopts the same factorization for the expectation of the adjacency matrix as the earlier degree-corrected mixed membership (DCMM) model. WDCMM extends DCMM from unweighted to weighted networks by allowing the elements of the adjacency matrix to be generated from distributions beyond Bernoulli. We first address community membership estimation under the model by applying a spectral algorithm and establishing a theoretical guarantee of consistency. Then, we propose overlapping weighted modularity to measure the quality of overlapping community detection for both assortative and disassortative weighted networks. To determine the number of communities, we combine the algorithm with the proposed modularity. We demonstrate the advantages of the model and the modularity through applications to simulated data and real-world networks.
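
As a rough illustration of the factorization the abstract describes, the following Python sketch builds a mean matrix $\Omega=\Theta\Pi P\Pi'\Theta$ and draws weighted adjacency matrices whose expectation is $\Omega$. It is a minimal sketch under stated assumptions, not the paper's simulation code: the helper names (`wdcmm_mean`, `sample_symmetric`), the toy values of $\Pi$, $P$, and $\theta$, the choice of Normal and Bernoulli weights, and the decision to keep diagonal entries are all hypothetical.

```python
import numpy as np

def wdcmm_mean(theta, Pi, P):
    """Return Omega = Theta @ Pi @ P @ Pi.T @ Theta, with Theta = diag(theta)."""
    Theta = np.diag(theta)
    return Theta @ Pi @ P @ Pi.T @ Theta

def sample_symmetric(Omega, sampler, rng):
    """Draw a symmetric matrix A with E[A] = Omega.

    `sampler(mean, rng)` must return one independent draw per entry of `mean`.
    Only the upper triangle (including the diagonal) is kept and mirrored.
    Keeping the diagonal is an assumption of this sketch.
    """
    draws = sampler(Omega, rng)
    return np.triu(draws) + np.triu(draws, 1).T

rng = np.random.default_rng(0)
n, K = 6, 2

# Toy membership matrix: four pure nodes and two mixed nodes (rows sum to 1).
Pi = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0],
               [0.0, 1.0], [0.5, 0.5], [0.3, 0.7]])
theta = rng.uniform(0.5, 1.0, size=n)      # hypothetical degree heterogeneity
P = np.array([[1.0, 0.2], [0.2, 0.8]])     # hypothetical connectivity matrix

Omega = wdcmm_mean(theta, Pi, P)

# Same mean matrix, two different edge-weight distributions.
A_normal = sample_symmetric(Omega, lambda m, r: r.normal(m, 1.0), rng)
A_bern = sample_symmetric(Omega, lambda m, r: r.binomial(1, m).astype(float), rng)
```

With Bernoulli draws the sketch reduces to the unweighted DCMM setting, while the Normal draws give real-valued (possibly negative) weights around the same $\Omega$.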

Paper Structure

This paper contains 13 sections, 4 theorems, 40 equations, 9 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Under $WDCMM_{n}(K,P,\Pi,\Theta,\mathcal{F})$, let $\hat{\Pi}$ be obtained from the ScD algorithm (Algorithm 1). When the assumptions labeled assumeTau and assumesparsity and the paper's Condition hold, there exists a permutation matrix $\mathcal{P}\in\mathbb{R}^{K\times K}$ such that, with probability at least $1-o(n^{\cdots})$, …

Figures (9)

  • Figure 1: Illustrative examples of three real-world weighted networks. For the Slovene Parliamentary Party network, edge weight means political space distance between parties. For the Gahuku-Gama subtribes network, edge weight represents friendship. For the Karate-club-weighted network, edge weight indicates the relative strength of the associations. For visualization, we do not show node labels.
  • Figure 2: Illustration of the Ideal Cone structure embedded within $U_{*}$ for the case when $K=3$. Here, dots represent the rows of $U_{*}$, while the hyperplane is constituted by the three rows of $U_{*}(\mathcal{I},:)$. Notably, all mixed rows of $U_{*}$ are located on one side of this hyperplane, where we call $U_{*}(i,:)$ a mixed row if the corresponding node $i$ is a mixed node and a pure row otherwise. In this graphical depiction, for every mixed node $i$, we assign its mixed membership as $\Pi(i,1)=r_{1}$, $\Pi(i,2)=r_{2}$, and $\Pi(i,3)=1-r_{1}-r_{2}$, where $r_{1}$ and $r_{2}$ are determined by $\frac{\mathrm{rand}(1)}{2}$, with $\mathrm{rand}(1)$ being a random value drawn from the $\mathrm{Uniform}(0,1)$ distribution. For visualization, these points have been projected and rotated from $\mathbb{R}^{3}$ into $\mathbb{R}^{2}$.
  • Figure 3: Numerical results of Experiment 1.
  • Figure 4: Numerical results of Experiment 2.
  • Figure 5: Numerical results of Experiment 3.
  • ...and 4 more figures
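
The construction in the Figure 2 caption can be sketched numerically. The Python snippet below generates mixed memberships with $r_{1}$ and $r_{2}$ drawn as $\mathrm{rand}(1)/2$, builds $\Omega=\Theta\Pi P\Pi'\Theta$, and checks that every row of the row-normalized leading eigenvector matrix is a nonnegative combination of the $K$ pure rows. It is a sketch under assumptions: it takes $U_{*}$ to be the row-normalized matrix of the $K$ leading eigenvectors of $\Omega$ (the caption does not define it), and the values of $\theta$ and $P$, the node counts, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n_pure_per, n_mixed = 3, 5, 20

# Pure nodes get one-hot rows; mixed nodes follow the Figure 2 recipe.
pure = np.repeat(np.eye(K), n_pure_per, axis=0)
r = rng.uniform(0.0, 1.0, size=(n_mixed, 2)) / 2          # r1, r2 = rand(1)/2
mixed = np.column_stack([r[:, 0], r[:, 1], 1.0 - r.sum(axis=1)])
Pi = np.vstack([pure, mixed])
n = Pi.shape[0]

theta = rng.uniform(0.3, 1.0, size=n)                     # hypothetical values
P = np.array([[1.0, 0.3, 0.2],
              [0.3, 0.9, 0.1],
              [0.2, 0.1, 0.8]])                           # hypothetical, full rank
Omega = np.diag(theta) @ Pi @ P @ Pi.T @ np.diag(theta)

# K leading eigenvectors (largest |eigenvalue|), then unit-normalize each row.
vals, vecs = np.linalg.eigh(Omega)
top = np.argsort(np.abs(vals))[-K:]
U = vecs[:, top]
U_star = U / np.linalg.norm(U, axis=1, keepdims=True)

# Index set I: one pure node per community. Solve U_star = Y @ U_star[I, :].
I = [0, n_pure_per, 2 * n_pure_per]
Y = U_star @ np.linalg.inv(U_star[I, :])

# Cone property: all coefficients are nonnegative (up to rounding error).
cone_ok = bool((Y > -1e-8).all())
```

In this noiseless setting the coefficient matrix `Y` is one-hot on pure rows and has several positive entries on mixed rows, which is the geometric picture the figure describes.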

Theorems & Definitions (25)

  • Definition 1
  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Theorem 1
  • Example 1
  • Example 2
  • ...and 15 more