Gradient flow-based modularity maximization for community detection in multiplex networks

Kai Bergermann; Martin Stoll

TL;DR

This work addresses scalable community detection in multiplex networks by introducing two gradient-flow formulations that maximize multiplex modularity. MPBTV reformulates modularity maximization as the minimization of a balanced multiplex total variation functional, while DGFM3 maximizes modularity directly via a matrix-valued gradient flow; both are solved efficiently with a graph MBO scheme and spectral truncation. Empirical results on real networks and image data show competitive classification accuracy and normalized mutual information (NMI), with runtimes that are orders of magnitude shorter than those of competing methods for large multiplex networks. The methods reduce computational complexity while preserving partition quality, which makes them practical for large multilayer systems with intra- and inter-layer structure.

Abstract

We propose two methods for the unsupervised detection of communities in undirected multiplex networks. These networks consist of multiple layers that record different relationships between the same entities or incorporate data from different sources. Both methods are formulated as gradient flows of suitable energy functionals: the first (MPBTV) builds on the minimization of a balanced total variation functional, which we show to be equivalent to multiplex modularity maximization, while the second (DGFM3) directly maximizes multiplex modularity. The resulting non-linear matrix-valued ordinary differential equations (ODEs) are solved efficiently by a graph Merriman--Bence--Osher (MBO) scheme. Key to the efficiency is the approximate integration of the discrete linear differential operators by truncated eigendecompositions in the matrix exponential function. Numerical experiments on several real-world multiplex networks show that our methods are competitive with the state of the art with respect to various metrics. Their major benefit is a significant reduction of computational complexity leading to runtimes that are orders of magnitude faster for large multiplex networks.
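The core numerical idea in the abstract (alternating a linear flow step, evaluated through a truncated eigendecomposition in the matrix exponential, with a thresholding step) can be illustrated on a single-layer graph. The following is a minimal sketch under stated assumptions, not the authors' MPBTV or DGFM3 implementation: it uses the standard single-layer Newman modularity matrix instead of a multiplex one, a dense `numpy.linalg.eigh` call instead of a sparse eigensolver, and the hypothetical helper names `modularity_matrix`, `modularity`, and `mbo_modularity`. The parameters `n_eig` and `dt` are illustrative choices.

```python
import numpy as np

def modularity_matrix(A, gamma=1.0):
    """Single-layer Newman modularity matrix M = A - gamma * k k^T / (2m)."""
    k = A.sum(axis=1)                      # node degrees; k.sum() equals 2m
    return A - gamma * np.outer(k, k) / k.sum()

def modularity(A, labels, gamma=1.0):
    """Modularity Q of a partition given as an integer label vector."""
    same = labels[:, None] == labels[None, :]
    return (modularity_matrix(A, gamma) * same).sum() / A.sum()

def mbo_modularity(A, init, n_comm, n_eig=2, dt=0.5, max_iter=50):
    """MBO-style iteration (assumed simplification): flow step, then threshold."""
    M = modularity_matrix(A)
    lam, V = np.linalg.eigh(M)             # eigenvalues in ascending order
    lam, V = lam[-n_eig:], V[:, -n_eig:]   # keep the n_eig largest eigenpairs
    growth = np.exp(dt * lam)              # spectrum of exp(dt * M), truncated
    labels = np.asarray(init).copy()
    for _ in range(max_iter):
        U = np.eye(n_comm)[labels]         # one-hot assignment matrix (n x K)
        # Flow step: U <- V exp(dt * Lambda) V^T U approximates exp(dt * M) U,
        # the solution of dU/dt = M U after time dt.
        U = V @ (growth[:, None] * (V.T @ U))
        # Thresholding step: assign each node to its largest component.
        new_labels = U.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy graph: two 5-cliques joined by the single edge (4, 5).
A = np.zeros((10, 10))
A[:5, :5] = 1.0
A[5:, 5:] = 1.0
np.fill_diagonal(A, 0.0)
A[4, 5] = A[5, 4] = 1.0

# Deliberately noisy initial partition (two nodes mislabeled).
init = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 1])
labels = mbo_modularity(A, init, n_comm=2)
print(labels, modularity(A, labels))
```

Because only `n_eig` eigenpairs enter the flow step, each iteration costs a few thin matrix products once the eigenpairs are available, which is the kind of offline/online split the abstract attributes to the truncated eigendecomposition. For large networks one would presumably replace the dense solver with a sparse routine that computes only the required eigenpairs.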

Paper Structure (14 sections, 25 equations, 3 figures, 8 tables, 1 algorithm)

Introduction
Background
Contribution
Outline
Multiplex networks
Community detection by modularity maximization
Multiplex balanced total variation minimization
Direct multiplex modularity maximization
Numerical solution of the ODEs
Numerical experiments
Small real-world multiplex networks
Ground truth multiplex networks
Image data
Conclusion and Outlook

Figures (3)

Figure 1: Test image with $293 \times 520$ pixels and corresponding ground truth labels. Green label colors represent "tree", yellow "beach", dark blue "sea", light blue "sky", and gray "swimming object" such as boats or humans.
Figure 2: MPBTV community detection result for the two layers of the multiplex network corresponding to the image from Figure 1. The corresponding multiplex modularity is $0.973$, the classification accuracy $0.976$, and the NMI score $0.909$.
Figure 3: Runtimes in seconds of the three methods implemented in Matlab. For GenLouvain, "matrix" denotes the version that explicitly assembles the full multiplex modularity matrix $\bm{M}$, while "fun. handle" denotes the function handle version that provides access to individual columns of $\bm{M}$. Offline runtimes of MPBTV and DGFM3 correspond to eigenvalue computations. Offline runtimes of GenLouvain matrix correspond to the assembly of the multiplex modularity matrix and those of GenLouvain fun. handle to the set-up of the function handle. The assembly of the multiplex modularity matrix in GenLouvain matrix exceeds the available $16$ GB of memory for $nL>19\,240$. Runtimes "per run" of MPBTV and DGFM3 correspond to the solution of \ref{eq:ODE} and \ref{eq:ODE_dir}, respectively, for one initial condition, while for GenLouvain they refer to the execution of the method for one randomized node-layer pair ordering. All runtimes are averaged over $10$ independent code executions (with the exception of the "per run" runtime of GenLouvain fun. handle for $nL=304\,720$, which is averaged over only $5$ independent code executions because of time constraints).

Theorems & Definitions (3)

proof
proof
proof
