Scalable Signed Exponential Random Graph Models under Local Dependence

Marc Schalberger, Cornelius Fritz

TL;DR

The paper tackles scalable analysis of large signed networks by enforcing local dependence through nonoverlapping blocks, combining within-block signed ERGMs with between-block signed SBMs. It develops a two-step estimation: a variational MM-based SBM approximation to learn block structure, followed by ERGM parameter estimation conditional on blocks, with uncertainty quantification via multiple imputations over blockings. The approach is demonstrated on large synthetic networks and a Wikipedia editor network, uncovering patterns consistent with structural balance and showing superior block recovery and fit for local-dependence models versus global-dependence benchmarks. An open-source R package, bigsergm, implements the full pipeline, enabling practical application to networks with thousands of nodes.

Abstract

Traditional network analysis focuses on binary edges, while real-world relationships are more nuanced, encompassing cooperation, neutrality, and conflict. The rise of negative edges in social media discussions spurred interest in analyzing signed interactions, especially in polarized debates. However, the vast data generated by digital networks presents challenges for traditional methods like Stochastic Block Models (SBM) and Exponential Family Random Graph Models (ERGM), particularly due to the homogeneity assumption and global dependence, which become increasingly unrealistic as network size grows. To address this, we propose a novel method that combines the strengths of SBM and ERGM while mitigating their weaknesses by incorporating local dependence based on nonoverlapping blocks. Our approach involves a two-step process: First, decomposing the network into sub-networks using SBM approximation, and, second, estimating parameters using ERGM methods. We validate our method on large synthetic networks and apply it to a signed Wikipedia network of thousands of editors. Through the use of local dependence, we find patterns consistent with structural balance theory.
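
As a rough sketch of what "local dependence based on nonoverlapping blocks" means, the joint distribution of the network given block memberships can be written as a product of within-block and between-block factors. The notation below is assumed for illustration and is not taken from the paper: $z_i \in \{1, \dots, K\}$ is the block of node $i$, $y_{i,j} \in \{+, -, 0\}$ is the signed state of the undirected pair $\{i, j\}$, $y_{k,k}$ is the subnetwork among nodes of block $k$, and $\theta_W$, $\theta_B$ are within- and between-block parameters.

$$
P_{\theta}(Y = y \mid Z = z) \;=\; \prod_{k=1}^{K} P_{\theta_W}\!\left(Y_{k,k} = y_{k,k} \mid z\right) \;\prod_{k < l} \; \prod_{i:\, z_i = k} \; \prod_{j:\, z_j = l} P_{\theta_B}\!\left(Y_{i,j} = y_{i,j} \mid z\right)
$$

In this reading, each within-block factor is a signed ERGM, $P_{\theta_W}(Y_{k,k} = y_{k,k} \mid z) \propto \exp\{\theta_W^\top s(y_{k,k})\}$ for some vector of sufficient statistics $s(\cdot)$, while between-block pairs are conditionally independent, as in an SBM. Dependence among edges is therefore confined to blocks, which is what keeps estimation tractable as the network grows.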

Paper Structure

This paper contains 27 sections, 63 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: Block recovery performance measured by Yule's $\phi$ coefficient. Left: Simulation Study 1 showing recovery accuracy for different numbers of blocks ($K = 25, 50, 75, 100$) and network sizes ($N = 1{,}250$ to $5{,}000$) at $\lambda = 1$. Right: Simulation Study 2 showing the effect of between-block sparsity, controlled by $\lambda$, on block recovery for $K = 25$. Higher values indicate better agreement with the true block structure.
  • Figure 2: Maximum pseudo-likelihood estimates for within-block parameters in Simulation Study 1. Results are based on known block memberships and used to assess parameter recovery accuracy across increasing network sizes.
  • Figure 3: Out-of-sample cross-validation results for the "Full Triad" model including all triadic terms ($\text{Edges}^{+/-}$, $\text{GWD}^{+/-}$, and all GWESP). The distribution of simulated statistics across 100 replications is compared against the observed statistics for each block.
  • Figure 4: Comparison of variational approximation, binary spectral clustering, and spectral clustering using only positive edges for block recovery.
  • Figure 5: Robustness of block recovery to misspecification of the number of blocks (true value $K = 25$). We varied the number of blocks around the true value ($K \pm 1$ and $K \pm 5$) to assess sensitivity.
  • ...and 13 more figures
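
Figure 1 reports block recovery using Yule's $\phi$ coefficient. The following is a minimal sketch of one common way to compute it for two partitions, assuming $\phi$ is evaluated on the 2x2 table of pairwise co-membership indicators (same block or not, under the true and the estimated blocking). This listing does not confirm that the paper uses exactly this construction, and the function name `yule_phi` is hypothetical.

```python
from itertools import combinations
from math import sqrt


def yule_phi(true_blocks, est_blocks):
    """Phi coefficient between two partitions, via pairwise co-membership.

    Assumption: each node pair is classified by whether both nodes share
    a block under the true labels and under the estimated labels; phi is
    then computed from the resulting 2x2 contingency table.
    """
    if len(true_blocks) != len(est_blocks):
        raise ValueError("label vectors must have the same length")

    n11 = n10 = n01 = n00 = 0
    for i, j in combinations(range(len(true_blocks)), 2):
        same_true = true_blocks[i] == true_blocks[j]
        same_est = est_blocks[i] == est_blocks[j]
        if same_true and same_est:
            n11 += 1
        elif same_true:
            n10 += 1
        elif same_est:
            n01 += 1
        else:
            n00 += 1

    # Product of the four marginal totals of the 2x2 table.
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")  # undefined when a margin is empty
    return (n11 * n00 - n10 * n01) / denom


# Relabelled but identical partitions agree perfectly.
print(yule_phi([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because the measure depends only on which pairs share a block, it is invariant to how blocks are labelled, which matters when comparing an estimated blocking against a ground truth with arbitrary block indices. The pairwise loop is quadratic in the number of nodes, so for networks of the sizes in Figure 1 a version based on a contingency table of block counts would be preferable.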

Theorems & Definitions (3)

  • Example 1: Signed SBM
  • Example 2: Signed Model with Triadic Terms
  • Example 3: Signed Model with Structural Terms