Dense Subgraph Discovery Meets Strong Triadic Closure

Chamalee Wickrama Arachchi, Iiro Kumpulainen, Nikolaj Tatti

TL;DR

The paper addresses dense subgraph discovery under the strong triadic closure (STC) constraint by labeling edges as strong or weak and maximizing $q(U,L) = \frac{m_s(U,L) + \lambda\, m_w(U,L)}{|U|}$ with $\lambda \in [0,1]$, where $m_s$ and $m_w$ count the strong and weak edges of the subgraph on node set $U$ under labeling $L$. It proves NP-hardness for $0 \le \lambda < 1$ and shows that $\lambda = 1$ yields a polynomial-time densest-subgraph formulation, while $\lambda = 0$ corresponds to Max-Clique. To solve STC-den, it presents an exact ILP (STC-ILP), an LP relaxation (STC-LP) with rounding, and four practical heuristics including STC-Cut, STC-Peel, and a continuous-relabelling peeling variant. Empirical results on synthetic and real networks demonstrate that the methods can recover ground-truth dense components, with STC-ILP delivering the best scores on small graphs and STC-Cut/STC-Peel offering scalable performance on larger networks; a DBLP case study confirms that the approach yields interpretable, densely connected, STC-compliant subgraphs. Overall, the work provides a principled framework combining STC with density optimization, enabling robust community-like subgraph discovery and suggesting directions for extending to other density notions and weighted settings.

Abstract

Finding dense subgraphs is a core problem with numerous graph mining applications such as community detection in social networks and anomaly detection. However, in many real-world networks connections are not equal. One way to label edges as either strong or weak is to use strong triadic closure (STC). Here, if one node connects strongly with two other nodes, then those two nodes should be connected by at least a weak edge. STC-labelings are not unique and finding the maximum number of strong edges is NP-hard. In this paper, we apply STC to dense subgraph discovery. More formally, our score for a given subgraph is the ratio between the sum of the number of strong edges and weak edges, weighted by a user parameter $\lambda$, and the number of nodes of the subgraph. Our goal is to find a subgraph and an STC-labeling maximizing the score. We show that for $\lambda = 1$, our problem is equivalent to finding the densest subgraph, while for $\lambda = 0$, our problem is equivalent to finding the largest clique, making our problem NP-hard. We propose an exact algorithm based on integer linear programming and four practical polynomial-time heuristics. We present an extensive experimental study that shows that our algorithms can find the ground truth in synthetic datasets and run efficiently in real-world datasets.
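
To make the score and the STC condition concrete, the following is a minimal Python sketch, not the paper's implementation. The function names `stc_violations` and `score`, the edge-list representation, and the toy graph are assumptions made for illustration; the graph is assumed to be simple (no self-loops) and the strong edges are assumed to be a subset of the edges.

```python
from itertools import combinations


def stc_violations(edges, strong):
    """Return wedges (u, v, w) where v-u and v-w are strong but u-w is not an edge.

    Hypothetical helper: an empty result means the labeling satisfies STC.
    """
    edge_set = {frozenset(e) for e in edges}
    strong_set = {frozenset(e) for e in strong}
    # Collect the strong neighbours of every node.
    strong_nbrs = {}
    for e in strong_set:
        a, b = tuple(e)
        strong_nbrs.setdefault(a, set()).add(b)
        strong_nbrs.setdefault(b, set()).add(a)
    bad = []
    for v, nbrs in strong_nbrs.items():
        # Every pair of strong neighbours of v must be joined by some edge.
        for u, w in combinations(sorted(nbrs), 2):
            if frozenset((u, w)) not in edge_set:
                bad.append((u, v, w))
    return bad


def score(nodes, edges, strong, lam):
    """(m_s + lam * m_w) / |U| on the subgraph induced by `nodes`."""
    node_set = set(nodes)
    strong_set = {frozenset(e) for e in strong}
    induced = [e for e in {frozenset(e) for e in edges} if e <= node_set]
    m_s = sum(1 for e in induced if e in strong_set)
    m_w = len(induced) - m_s
    return (m_s + lam * m_w) / len(node_set)


# Toy graph (an assumption for illustration): a triangle 1-2-3 plus a pendant edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
triangle = [(1, 2), (2, 3), (1, 3)]

print(stc_violations(edges, triangle))           # triangle strong, pendant edge weak
print(score({1, 2, 3, 4}, edges, triangle, 0.5))
print(stc_violations(edges, edges))              # labeling every edge strong
```

In this toy graph, labeling only the triangle as strong satisfies STC, whereas labeling every edge strong does not, because node 3 would then be strongly tied to node 4 and to nodes 1 and 2, which are not adjacent to node 4.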

Paper Structure

This paper contains 15 sections, 13 theorems, 10 equations, 3 figures, 3 tables, 4 algorithms.

Key Result

Proposition 1

For $\lambda=0$, STC-den is NP-hard.
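
As a short worked calculation (an illustration of why cliques matter when $\lambda = 0$, not the paper's hardness proof): in a clique on $k$ nodes every pair of nodes is adjacent, so labeling all edges strong satisfies STC, and with no weak edges the score is

$$
q = \frac{\binom{k}{2} + 0 \cdot 0}{k} = \frac{k(k-1)/2}{k} = \frac{k-1}{2}.
$$

For $k = 5$ this gives $10/5 = 2.0$, which agrees with the score reported for the size-$5$ clique in Figure 1(b).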

Figures (3)

  • Figure 1: Strong (red) and weak (blue) edges of the Karate club dataset when maximizing the number of strong edges (a), and for $\lambda = 0$ (b) and $\lambda = 0.5$ (c), using our integer-linear-programming-based algorithm (STC-ILP). We define our score as the sum of the number of strong and weak edges weighted by a parameter $\lambda$, divided by the size of the subgraph. The scores are $2.0$ and $2.06$ for (b) and (c), respectively. We see that (b) is a clique of size $5$.
  • Figure 2: Scores and percentages of strong edges as a function of $\lambda$ for the Synthetic dataset.
  • Figure 3: Time in seconds as a function of the number of edges ${\left|E\right|}$ and the number of wedges ${\left|V(Z)\right|}$.

Theorems & Definitions (13)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5
  • Proposition 6
  • Proposition 7
  • Proposition 8
  • Proposition 9
  • Proposition 10
  • ...and 3 more