On the accurate computation of expected modularity in probabilistic networks

Xin Shen, Matteo Magnani, Christian Rohner, Fiona Skerman

TL;DR

The results demonstrate that removing low-probability edges or treating probabilities as weights produces inaccurate results, while the convergence of the sampling method varies with the parameters of the network.

Abstract

Modularity is one of the most widely used measures for evaluating communities in networks. In probabilistic networks, where the existence of edges is uncertain and uncertainty is represented by probabilities, the expected value of modularity can be used instead. However, efficiently computing expected modularity is challenging. To address this challenge, we propose a novel and efficient technique (FPWP) for computing the probability distribution of modularity and its expected value. In this paper, we implement and compare our method and various general approaches for expected modularity computation in probabilistic networks. These include: (1) translating probabilistic networks into deterministic ones by removing low-probability edges or treating probabilities as weights, (2) using Monte Carlo sampling to approximate expected modularity, and (3) brute-force computation. We evaluate the accuracy and time efficiency of FPWP through comprehensive experiments on both real-world and synthetic networks with diverse characteristics. Our results demonstrate that removing low-probability edges or treating probabilities as weights produces inaccurate results, while the convergence of the sampling method varies with the parameters of the network. Brute-force computation, though accurate, is prohibitively slow. In contrast, our method is much faster than brute-force computation, but guarantees an accurate result.
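
To make the comparison in the abstract concrete, the sketch below implements three of the general approaches it lists, on a toy network: brute-force enumeration of all possible worlds, Monte Carlo sampling, and the two deterministic shortcuts (thresholding and probabilities-as-weights). It is not the authors' FPWP method, which is not described in this excerpt. The function names, the toy network, the assumption of independent edges, and the convention that an edgeless world has modularity 0 are all assumptions made for illustration.

```python
import itertools
import random


def modularity(edge_weights, community_of):
    """Newman modularity of an undirected graph given as {(u, v): weight}."""
    total_w = sum(edge_weights.values())
    if total_w == 0:
        return 0.0  # assumed convention: a world without edges has modularity 0
    internal, strength = {}, {}
    for (u, v), w in edge_weights.items():
        cu, cv = community_of[u], community_of[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(internal.get(c, 0.0) / total_w - (s / (2 * total_w)) ** 2
               for c, s in strength.items())


def expected_modularity_brute_force(prob_edges, community_of):
    """Exact expectation over all 2^|E| possible worlds (independent edges assumed)."""
    edges = list(prob_edges)
    total = 0.0
    for mask in itertools.product([False, True], repeat=len(edges)):
        p_world = 1.0
        world = {}
        for present, e in zip(mask, edges):
            p_world *= prob_edges[e] if present else 1.0 - prob_edges[e]
            if present:
                world[e] = 1.0
        total += p_world * modularity(world, community_of)
    return total


def expected_modularity_sampling(prob_edges, community_of, n_samples, seed=0):
    """Monte Carlo estimate: average modularity over sampled possible worlds."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        world = {e: 1.0 for e, p in prob_edges.items() if rng.random() < p}
        total += modularity(world, community_of)
    return total / n_samples


def modularity_after_threshold(prob_edges, community_of, threshold=0.5):
    """Deterministic shortcut: drop edges with probability below the threshold."""
    kept = {e: 1.0 for e, p in prob_edges.items() if p >= threshold}
    return modularity(kept, community_of)


def modularity_with_weights(prob_edges, community_of):
    """Deterministic shortcut: treat each edge probability as an edge weight."""
    return modularity(prob_edges, community_of)


# Hypothetical toy network: two triangles joined by an uncertain bridge.
community_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
prob_edges = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.4,
              (3, 4): 0.9, (4, 5): 0.7, (3, 5): 0.3,
              (2, 3): 0.5}

print("brute force :", expected_modularity_brute_force(prob_edges, community_of))
print("sampling    :", expected_modularity_sampling(prob_edges, community_of, 5000))
print("threshold   :", modularity_after_threshold(prob_edges, community_of))
print("as weights  :", modularity_with_weights(prob_edges, community_of))
```

The brute-force loop visits $2^{|E|}$ worlds, so it is usable only on very small networks, which is the cost the abstract calls prohibitive. The sampling estimate depends on the number of samples and the seed, while the two shortcuts each evaluate a single deterministic graph and therefore need not agree with the exact expectation.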

Paper Structure (25 sections, 19 equations, 14 figures, 5 tables, 2 algorithms)

Figures (14)

  • Figure 1: Edge partitions used in the community-based definition of modularity, for a community $c_i$.
  • Figure 2: A probabilistic network with a community $c$.
  • Figure 3: A tree enumerating all partitions of possible worlds defined by community $c$.
  • Figure 4: All possible worlds in partition 7. $|e_c^w|, |e_{c,\Bar{c}}^w|$, and $|e_{\Bar{c}}^w|$ are constant inside the partition.
  • Figure 5: Running time of brute-force, $\mathrm{PWP}$, and $\mathrm{FPWP}$, with log axes.
  • ...and 9 more figures