Faster maximal clique enumeration in large real-world link streams

Alexis Baudin, Clémence Magnien, Lionel Tabourier

TL;DR

This work addresses the problem of enumerating maximal cliques in large real-world link streams, where a clique is a set of vertices that all interact pairwise throughout a time interval. It introduces a Bron-Kerbosch-based algorithm that enumerates time-maximal cliques by operating on the instantaneous graphs $G_t$, then filters them for vertex-maximality using a final-time test; a pivoting strategy prunes the search. The authors prove correctness and analyze the complexity from two viewpoints: a parameterized bound in terms of the input, and an output-sensitive bound that is close to the output size. Empirically, the approach is faster than previous methods on every dataset tested, with speedups of up to $10^4$ for the C++ implementation; it enumerates maximal cliques on massive datasets (up to $10^8$ links) and supports parallel execution. The work advances practical clique enumeration in temporal networks, with implications for density-based analysis, community discovery, and dynamic subgraph mining in large real-world systems.
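
For readers unfamiliar with the starting point, the following is a minimal sketch of the classical Bron-Kerbosch algorithm with pivoting on a static graph. It is the textbook graph algorithm, not the link-stream algorithm of the paper; the function name `bron_kerbosch_pivot`, the adjacency-dictionary representation and the toy graph are assumptions made for illustration.

```python
def bron_kerbosch_pivot(adj, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique of a static graph.

    adj maps each vertex to the set of its neighbours (illustrative format).
    r: current clique, p: candidate vertices, x: already-processed vertices.
    """
    if p is None:
        p = frozenset(adj)
    if not p and not x:
        # Nothing can extend r and nothing processed covers it: r is maximal.
        yield r
        return
    # Pivot: the vertex of P ∪ X with the most neighbours in P.
    # Only non-neighbours of the pivot need to be tried, which prunes the search.
    pivot = max(p | x, key=lambda v: len(adj[v] & p))
    for v in p - adj[pivot]:
        yield from bron_kerbosch_pivot(adj, r | {v}, p & adj[v], x & adj[v])
        p = p - {v}
        x = x | {v}


# Toy graph (hypothetical): a triangle a-b-c plus a pendant edge c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = sorted(sorted(c) for c in bron_kerbosch_pivot(adj))
print(cliques)  # [['a', 'b', 'c'], ['c', 'd']]
```

The pivot rule used here (most neighbours among the candidates) is one common choice; the paper's own pivoting strategy for link streams may differ.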

Abstract

Link streams offer a good model for representing interactions over time. They consist of links $(b,e,u,v)$, where $u$ and $v$ are vertices interacting during the whole time interval $[b,e]$. In this paper, we deal with the problem of enumerating maximal cliques in link streams. A clique is a pair $(C,[t_0,t_1])$, where $C$ is a set of vertices that all interact pairwise during the full interval $[t_0,t_1]$. It is maximal when neither its set of vertices nor its time interval can be increased. Some of the main works solving this problem are based on the famous Bron-Kerbosch algorithm for enumerating maximal cliques in graphs. We take this idea as a starting point to propose a new algorithm which matches the cliques of the instantaneous graphs formed by links existing at a given time $t$ to the maximal cliques of the link stream. We prove its validity and compute its complexity, which is better than that of the state-of-the-art algorithms in many cases of interest. We also study the output-sensitive complexity, which is close to the output size, thereby showing that our algorithm is efficient. To confirm this, we perform experiments on link streams used in the state of the art, and on massive link streams, up to 100 million links. In all cases our algorithm is faster, mostly by a factor of at least 10 and up to a factor of $10^4$. Moreover, it scales to massive link streams for which the existing algorithms are not able to provide the solution.
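
The definitions above can be made concrete with a small sketch. It stores links as $(b,e,u,v)$ tuples and checks whether a pair $(C,[t_0,t_1])$ is a clique. The names `Link`, `interacts_throughout` and `is_clique` are hypothetical, and the sketch assumes that two links between the same pair of vertices never overlap or touch, so a single link must cover the whole interval. Only the link between $b$ and $c$ on $[1,5]$ is taken from the caption of Figure 1; the other two intervals are invented so that $(\{a,b,c\},[2,4])$ is a clique.

```python
from itertools import combinations
from typing import NamedTuple


class Link(NamedTuple):
    b: float  # start time
    e: float  # end time
    u: str    # first vertex
    v: str    # second vertex


def interacts_throughout(links, u, v, t0, t1):
    """True if one link between u and v covers the whole interval [t0, t1]."""
    return any(
        {link.u, link.v} == {u, v} and link.b <= t0 and t1 <= link.e
        for link in links
    )


def is_clique(links, vertices, t0, t1):
    """True if all vertices interact pairwise during the full interval [t0, t1]."""
    return all(
        interacts_throughout(links, u, v, t0, t1)
        for u, v in combinations(sorted(vertices), 2)
    )


# Toy link stream; the a-b and a-c intervals are assumptions.
stream = [Link(1, 5, "b", "c"), Link(2, 6, "a", "b"), Link(0, 4, "a", "c")]

print(is_clique(stream, {"a", "b", "c"}, 2, 4))  # True
print(is_clique(stream, {"a", "b", "c"}, 2, 5))  # False: a-c stops at 4
```

In this toy stream the interval $[2,4]$ cannot be widened without losing a link, which is what time-maximality asks for.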

Paper Structure

This paper contains 28 sections, 10 theorems, 6 equations, 6 figures, 5 tables, 2 algorithms.

Key Result

Lemma 1

$(C,\left[t_0,t_1\right])$ is a time-maximal clique if and only if:

Figures (6)

  • Figure 1: Left: a link stream with interaction time on the abscissa and vertices on the ordinate. For example, there is a link between $b$ and $c$ during the time interval $[1,5]$. The maximal cliques are represented in color, e.g., on the interval $[2,4]$, the three vertices $a,b,c$ are linked together, and form a maximal clique $(\{a,b,c\},[2,4])$. Right: the instantaneous graph $G_3$ of this link stream at $t = 3$.
  • Figure 2: A link stream with the two maximal cliques that start at $t=3$ in color, and its instantaneous graph $G_3$ with the edges that start at $t=3$ in blue. There are three time-maximal cliques that start at this instant, and each contains a new edge (in blue): $(\{a,b\},[3,7])$, $(\{b,c\},[3,5])$ and $(\{a,b,c\},[3,5])$. The clique $(\{b,c\},[3,5])$ is not vertex-maximal, while the two others are.
  • Figure 3: General structure of Algorithm \ref{algo:framework}: for each time $t \in T$, it enumerates the maximal cliques that start at $t$.
  • Figure 4: Example of a clique enumeration using $ForbidEdges$. An instantaneous graph $G_t$ of a link stream $L$ at time $t$ is represented. The thick red edges correspond to a link that begins at $t$, while the others correspond to links that have begun earlier. The clique $\{a,b,c,d\}$ contains the two new edges $\{a,c\}$ and $\{b,d\}$, which is why there is a need for the set $ForbidEdges$ to avoid enumerating it twice.
  • Figure 5: Left: Summary of the computation times of maximal clique enumerations as a function of the number $m$ of links for all link stream datasets in Tables \ref{tab:time-all-bentert} and \ref{tab:time-all-bigLS}. The three lines at the top represent enumerations that are interrupted because they exceed 24 hours or 380 GB of RAM. Right: Speed-up factor of our implementations with respect to the fastest state-of-the-art method, as a function of the number $m$ of links. There is one point per dataset where at least one state-of-the-art algorithm finishes in less than 24 hours and using less than 380 GB of RAM.
  • ...and 1 more figure
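
The instantaneous graphs shown in Figures 1 and 2 can be sketched in a few lines. The function name `instantaneous_graph` and the toy link stream are assumptions: the intervals were chosen to be consistent with the cliques listed in the caption of Figure 2, but the start and end of the link between $a$ and $c$ are not given there and are invented.

```python
def instantaneous_graph(links, t):
    """Return (edges, new_edges) for the instantaneous graph G_t.

    links is an iterable of (b, e, u, v) tuples.
    edges holds every pair {u, v} with a link such that b <= t <= e;
    new_edges holds the pairs whose link starts exactly at t.
    """
    edges = {frozenset((u, v)) for b, e, u, v in links if b <= t <= e}
    new_edges = {frozenset((u, v)) for b, e, u, v in links if b == t}
    return edges, new_edges


# Toy stream in the spirit of Figure 2 (the a-c interval is an assumption).
stream = [(3, 7, "a", "b"), (3, 5, "b", "c"), (1, 6, "a", "c")]

edges, new_edges = instantaneous_graph(stream, 3)
print(sorted(sorted(e) for e in edges))      # [['a', 'b'], ['a', 'c'], ['b', 'c']]
print(sorted(sorted(e) for e in new_edges))  # [['a', 'b'], ['b', 'c']]
```

Here $G_3$ is a triangle in which two of the three edges are new, matching the blue edges described in the caption of Figure 2.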

Theorems & Definitions (28)

  • Definition 1: Link stream
  • Definition 2: Clique of a link stream
  • Definition 3: Time-maximal clique
  • Definition 4: Vertex-maximal clique
  • Definition 5: Maximal Clique
  • Definition 6: Instantaneous graph $G_t$ associated to a link stream at time $t$
  • Definition 7: End time $\mathcal{E}_{t}(u,v)$ of an edge $\{u,v\}$ of ${G_t}$
  • Definition 8: Final time $\mathcal{E}_{t}(C)$ of a clique $C$ of $G_t$
  • Lemma 1: Time-maximality of a clique
  • proof
  • ...and 18 more
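
Definitions 7 and 8 and the two maximality notions can be illustrated on a toy example. The sketch below makes several assumptions. It reads the final time of a clique as the smallest end time among its edges. It treats a clique of $G_t$ as time-maximal with start $t$ when it contains at least one edge starting at $t$ and ends at its final time, which is what the caption of Figure 2 suggests; the full statement of Lemma 1 is not reproduced here. It treats such a clique as vertex-maximal when no outside vertex can be added without lowering the final time. A brute-force scan over vertex subsets stands in for the Bron-Kerbosch search, so it is only usable on tiny inputs. All function names and the interval of the link between $a$ and $c$ are hypothetical.

```python
from itertools import combinations


def end_times(links, t):
    """Map each edge {u, v} of G_t to the end time of the link covering t."""
    return {frozenset((u, v)): e for b, e, u, v in links if b <= t <= e}


def final_time(ends, clique):
    """Smallest end time over the edges of clique; None if not a clique of G_t."""
    pairs = [frozenset(p) for p in combinations(clique, 2)]
    if not all(p in ends for p in pairs):
        return None
    return min(ends[p] for p in pairs)


def cliques_starting_at(links, t):
    """List (vertices, (t, t1), is_vertex_maximal) for cliques starting at t."""
    ends = end_times(links, t)
    new = {frozenset((u, v)) for b, e, u, v in links if b == t}
    vertices = sorted(set().union(*ends))
    result = []
    for k in range(2, len(vertices) + 1):
        for c in combinations(vertices, k):
            t1 = final_time(ends, c)
            has_new_edge = any(frozenset(p) in new for p in combinations(c, 2))
            if t1 is None or not has_new_edge:
                continue  # not a clique of G_t, or it started before t
            # Final-time test: adding x must not shorten the interval.
            extendable = any(
                final_time(ends, c + (x,)) == t1 for x in vertices if x not in c
            )
            result.append((set(c), (t, t1), not extendable))
    return result


# Toy stream in the spirit of Figure 2 (the a-c interval is an assumption).
stream = [(3, 7, "a", "b"), (3, 5, "b", "c"), (1, 6, "a", "c")]
for vertices, interval, vertex_maximal in cliques_starting_at(stream, 3):
    print(sorted(vertices), interval, vertex_maximal)
```

On this input the sketch reports the three cliques named in the caption of Figure 2 and marks only $(\{b,c\},[3,5])$ as not vertex-maximal, since $a$ can join it over the whole interval.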