Scaling Frustration Index and Corresponding Balanced State Discovery for Real Signed Graphs
Muhieddine Shebaro, Jelena Tešić
TL;DR
This work tackles the NP-hard problem of computing the frustration index of large signed graphs with two scalable methods. graphBpp is a tree-based algorithm that builds a memory-bound frustration cloud of nearest balanced states to approximate the frustration index. graphL is a gradient-descent approach that optimizes a continuous relaxation of the vertex signs to minimize imbalance. graphBpp uses multiple spanning-tree samplers and stores only the best near-balanced states, achieving speedups of roughly 300x over the state of the art on graphs with millions of edges. graphL offers linear-time gradient updates and typically yields states with lower frustration, at the cost of hyperparameter tuning and a single-state output. The authors provide extensive empirical comparisons on large real-world benchmarks (e.g., Konect, Amazon), demonstrating the scalability of graphBpp and the efficiency of graphL, including a memory-limited scalable variant that can process graphs with ~$10^7$ vertices and ~$2.2\times 10^7$ edges. Overall, the paper advances practical balanced-state discovery in large real-world signed networks and exposes a trade-off between breadth (multiple near-balanced states) and depth (a single optimized state).
Abstract
Structural balance modeling for signed graph networks describes how to model the sources of conflict. The state of the art focuses on computing the frustration index of a signed graph, a critical step toward solving problems in social and sensor networks and in scientific modeling. Existing approaches do not scale to large signed networks with tens of millions of vertices and edges. This paper proposes two efficient algorithms: the tree-based \emph{graphBpp} and the gradient descent-based \emph{graphL}. We show that both algorithms outperform the state of the art in efficiency and effectiveness at discovering the balanced state for \emph{any} network size. We introduce the first comparison of exact, tree-based, and gradient descent-based methods on large graphs. The proposed methods are around \emph{300+ times faster} than the state of the art on large signed graphs. We find that the exact method finds the optimal frustration index only for small graphs. \emph{graphBpp} scales the approximation of the frustration index to large signed graphs at the cost of accuracy. \emph{graphL} produces a state with lower frustration at the cost of requiring a proper variable initialization and hyperparameter tuning.
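
To make the tree-based idea concrete, the following minimal Python sketch shows how a single spanning tree yields a nearby balanced state and an upper bound on the frustration index. It is an illustration under stated assumptions, not the \emph{graphBpp} implementation. The function names `balanced_state_from_tree` and `frustration_upper_bound`, the edge-list input format, and the shuffled breadth-first tree sampler are hypothetical choices made for this example, and the graph is assumed to be connected.

```python
import random
from collections import deque


def balanced_state_from_tree(n, edges, seed=0):
    """Label vertices +1/-1 along one random BFS spanning tree.

    edges: list of (u, v, sign) with sign in {+1, -1}; vertices are 0..n-1.
    Assumes a connected graph (hypothetical helper, for illustration only).
    Returns the vertex labels and the list of edges the labels do not satisfy.
    """
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for u, v, s in edges:
        adj[u].append((v, s))
        adj[v].append((u, s))
    for nbrs in adj:
        rng.shuffle(nbrs)  # randomize which BFS tree is built (not uniform)

    root = rng.randrange(n)
    label = [0] * n  # 0 means "not visited yet"
    label[root] = 1
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, s in adj[u]:
            if label[v] == 0:
                # Tree edge: choose v's label so that this edge is satisfied.
                label[v] = label[u] * s
                queue.append(v)

    # An edge is frustrated when its sign disagrees with the endpoint labels.
    frustrated = [(u, v, s) for u, v, s in edges if label[u] * label[v] != s]
    return label, frustrated


def frustration_upper_bound(n, edges, num_trees=20):
    """Smallest frustrated-edge count over several sampled trees."""
    return min(
        len(balanced_state_from_tree(n, edges, seed=k)[1])
        for k in range(num_trees)
    )
```

Flipping the signs of the returned frustrated edges produces a balanced graph, so each sampled tree gives one candidate balanced state, and the minimum count over the samples is an upper bound on the frustration index rather than its exact value.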
