Enumerating Graphlets with Amortized Time Complexity Independent of Graph Size

Alessio Conte, Roberto Grossi, Yasuaki Kobayashi, Kazuhiro Kurita, Davide Rucci, Takeaki Uno, Kunihiro Wasa

TL;DR

This paper gives the first algorithm for listing all graphlets of order k in a graph G whose amortized cost per solution depends only on k: k-graphlets are listed in O(k^2) time per solution, and edge k-graphlets in O(k) time per solution.

Abstract

Graphlets of order $k$ in a graph $G$ are connected subgraphs induced by $k$ nodes (called $k$-graphlets) or by $k$ edges (called edge $k$-graphlets). They are among the subgraphs of interest in network analysis, as they give insight into both the local and global structure of a network. While several algorithms exist for discovering and enumerating graphlets, the cost per solution of such algorithms typically depends on the size of the graph $G$ or on its maximum degree. In real networks, even the latter can be on the order of millions, whereas $k$ is typically required to be small. In this paper we provide the first algorithm to list all graphlets of order $k$ in a graph $G=(V,E)$ with an amortized cost per solution depending solely on the order $k$, in contrast to previous approaches where the cost also depends on the size of $G$ or its maximum degree. Specifically, we show that it is possible to list $k$-graphlets in $O(k^2)$ time per solution, and to list edge $k$-graphlets in $O(k)$ time per solution. Furthermore, we show that if the input graph has bounded degree, then the cost per solution for listing $k$-graphlets is reduced to $O(k)$. Whenever $k = O(1)$, as is often the case in practical settings, these algorithms are the first to achieve constant time per solution.
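To make the definition of a $k$-graphlet concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it tests every $k$-subset of vertices for connectivity, so its cost per solution depends on the size of $G$, which is exactly the dependence the paper removes. The function names and the example path graph are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if `nodes` induces a connected subgraph of `adj`."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            # Only follow edges that stay inside the candidate vertex set.
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def brute_force_k_graphlets(adj, k):
    """Return every k-subset of vertices inducing a connected subgraph.

    Naive baseline: examines all C(n, k) subsets of the n vertices.
    """
    return [
        frozenset(subset)
        for subset in combinations(sorted(adj), k)
        if is_connected(subset, adj)
    ]


# Hypothetical input: the path a - b - c - d as an adjacency map.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
graphlets = brute_force_k_graphlets(adj, 3)
for g in graphlets:
    print(sorted(g))
```

On the path graph only the two subsets of three consecutive vertices are reported, since any other choice leaves one vertex isolated in the induced subgraph.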

Paper Structure

This paper contains 20 sections, 21 theorems, 4 equations, 2 figures, 5 algorithms.

Key Result

Theorem 1

If, for every internal node $X$ of $\mathcal{T}$, it holds that where $\alpha > 1$ and $\beta \ge 0$ are constants and $T^*$ is the worst case running time of processing any leaf node of $\mathcal{T}$, then, the amortized time of each node $X$ in $\mathcal{T}$ is $O(T^*)$.
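The hypothesis of the theorem is the push-out (PO) condition from the work of Uno cited in the theorem's title. As it is commonly stated there, it reads as below. The notation is an assumption, since this summary does not define it: $T(X)$ is taken to be the time spent in node $X$ excluding its recursive calls, and $C(X)$ the set of children of $X$ in $\mathcal{T}$.

```latex
\[
  \sum_{Y \in C(X)} T(Y) \;\ge\; \alpha\, T(X) \;-\; \beta\,\bigl(|C(X)| + 1\bigr)\, T^*
\]
```

Read this way, the children of $X$ must together cost at least a constant factor $\alpha > 1$ more than $X$ itself, up to an additive slack of $O(T^*)$ per child, so the cost of internal nodes can be pushed down the recursion tree toward the leaves.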

Figures (2)

  • Figure 1: (a) A graph $G=(V,E)$ with $|V|=4$ vertices and $|E|=5$ edges. (b) All 3-graphlets contained in $G$. (c) All edge 3-subgraphs, i.e., edge $3$-graphlets in $G$, where dotted edges denote subgraphs that are also acyclic (3-subtrees). Note that some of the $k$-graphlets are also edge $k$-graphlets, but this is not necessarily so; also, some edge $k$-graphlets have a number of vertices different from $k$.
  • Figure 2: Amortizing the cost of the recursive call $X$ on the right path of $Y$: we charge $O(1)$ to each call on the nodes highlighted by the bracket. These calls are at least $d_{G_X}(z) - 2k$, so we can amortize the $O(d_{G_X}(z))$ cost on them.
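The distinction drawn in Figure 1 between edge $k$-graphlets and $k$-subtrees can be checked by brute force on a small input. The sketch below is not the paper's algorithm, and the vertex labels are hypothetical; it assumes $G$ is simple, in which case a graph with 4 vertices and 5 edges is the complete graph on 4 vertices minus one edge. It relies on the fact that a connected subgraph with $k$ edges is a tree exactly when it has $k+1$ vertices.

```python
from itertools import combinations


def edge_subgraph_info(edge_subset):
    """Return (is_connected, vertex_count) for the subgraph formed by the edges."""
    adj = {}
    for u, v in edge_subset:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(adj), len(adj)


def brute_force_edge_k_graphlets(edges, k):
    """List connected k-edge subgraphs as (edges, vertex_count, is_tree)."""
    result = []
    for subset in combinations(edges, k):
        connected, n_vertices = edge_subgraph_info(subset)
        if connected:
            # A connected graph with k edges is acyclic iff it has k + 1 vertices.
            result.append((subset, n_vertices, n_vertices == k + 1))
    return result


# Assumed stand-in for the graph of Figure 1(a): K4 on {a, b, c, d} minus edge c-d.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]
found = brute_force_edge_k_graphlets(edges, 3)
trees = [subset for subset, _, is_tree in found if is_tree]
print(len(found), "edge 3-graphlets,", len(trees), "of them 3-subtrees")
```

On this assumed graph, the two triangles are the only edge 3-graphlets with exactly 3 vertices; every 3-subtree spans all 4 vertices, which illustrates the caption's remark that an edge $k$-graphlet need not have $k$ vertices.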

Theorems & Definitions (38)

  • Theorem 1: The PO condition (Uno, WADS 2015)
  • Lemma 2
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Lemma 3
  • proof
  • Theorem 4
  • ...and 28 more