HiGen: Hierarchical Graph Generative Networks

Mahdi Karami

TL;DR

HiGen is a hierarchical, coarse-to-fine graph generative framework that explicitly models communities and the cross-community links between them to capture multi-scale structure. Edge weights at each level follow a multinomial distribution, and a recursive stick-breaking factorization of that distribution lets the model generate integer-valued weights autoregressively. Communities are generated in parallel and bipartite cross-edges are predicted by separate networks, which keeps generation scalable. Using GraphGPS-based GNNs and mixture models, HiGen achieves state-of-the-art graph quality across benchmark datasets and shows reduced sensitivity to node ordering. The approach targets scalable generation of large, structured graphs and lays groundwork for extensions to attributed graphs and multi-type edges; the code is publicly available.

Abstract

Most real-world graphs exhibit a hierarchical structure, which is often overlooked by existing graph generation methods. To address this limitation, we propose a novel graph generative network that captures the hierarchical nature of graphs and successively generates the graph sub-structures in a coarse-to-fine fashion. At each level of hierarchy, this model generates communities in parallel, followed by the prediction of cross-edges between communities using separate neural networks. This modular approach enables scalable graph generation for large and complex graphs. Moreover, we model the output distribution of edges in the hierarchical graph with a multinomial distribution and derive a recursive factorization for this distribution. This enables us to generate community graphs with integer-valued edge weights in an autoregressive manner. Empirical studies demonstrate the effectiveness and scalability of our proposed generative model, achieving state-of-the-art performance in terms of graph quality across various benchmark datasets. The code is available at https://github.com/Karami-m/HiGen_main.
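
To make the notion of hierarchy levels concrete, the following is a minimal sketch of how a leaf-level graph could be aggregated into its parent level. It assumes the convention that a parent node's weight is the sum of the edge weights inside its community and a parent edge's weight is the sum of the cross-edges between two communities. The function name `coarsen`, the toy graph, and the community assignment are hypothetical and are not taken from the HiGen code base.

```python
from collections import defaultdict


def coarsen(edges, community):
    """Aggregate a weighted leaf-level graph into its parent level.

    edges: dict mapping (u, v) node pairs to positive integer weights.
    community: dict mapping each node to its community id.
    Returns (node_weight, edge_weight) for the parent graph, where
    node_weight[c] sums the edges inside community c and
    edge_weight[(c, d)] sums the cross-edges between c and d (c < d).
    """
    node_weight = defaultdict(int)
    edge_weight = defaultdict(int)
    for (u, v), w in edges.items():
        cu, cv = community[u], community[v]
        if cu == cv:
            # Intra-community edge: contributes to the parent node.
            node_weight[cu] += w
        else:
            # Cross-community edge: contributes to the parent edge.
            edge_weight[(min(cu, cv), max(cu, cv))] += w
    return dict(node_weight), dict(edge_weight)


# Hypothetical toy graph: two triangles joined by two cross-edges.
edges = {
    (0, 1): 1, (1, 2): 1, (0, 2): 1,   # community 0
    (3, 4): 1, (4, 5): 1, (3, 5): 1,   # community 1
    (2, 3): 1, (1, 4): 1,              # cross-edges
}
community = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
node_weight, edge_weight = coarsen(edges, community)
```

Generation runs in the opposite direction: given the parent weights, the model has to distribute each total over the edges of the corresponding community or bipartite, which is why the total edge weight is conserved across levels in this sketch.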

Paper Structure

This paper contains 41 sections, 9 theorems, 25 equations, 7 figures, 13 tables, 2 algorithms.

Key Result

Theorem 3.1

Let the random vector ${\mathbf{w}} := [w_e]_{e ~\in~ {\mathcal{E}}({{\mathcal{G}}^l})}$ denote the set of weights of all edges of ${\mathcal{G}}^l$ such that their sum is $w_0 = \bm{1}^{T}~{\mathbf{w}}$. The joint probability of ${\mathbf{w}}$ can be described by a multinomial distribution: … where $\{ {\bm{\theta}}^{l}_{ij}[e] \in [0, 1], ~ \text{s.t.} ~ \bm{1}^T {\bm{\theta}}^{l}_{ij} = \dots$
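
For reference, the multinomial distribution named in the theorem has the standard probability mass function shown below. The notation $\mathrm{Mu}$ and the use of a single parameter vector ${\bm{\theta}}$ indexed by the edges of ${\mathcal{G}}^l$ are assumptions made here for illustration; they are not copied from the theorem statement.

```latex
\mathrm{Mu}(\mathbf{w} \mid w_0, \bm{\theta})
  = \frac{w_0!}{\prod_{e} w_e!} \prod_{e} \bm{\theta}[e]^{\,w_e},
\qquad \sum_{e} w_e = w_0, \quad \sum_{e} \bm{\theta}[e] = 1 .
```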

Figures (7)

  • Figure 1: (a) A sample hierarchical graph ${\mathcal{HG}}$ with 2 levels. Communities are shown in different colors, and the weight of a node and the weight of an edge at a higher level represent the sum of the edge weights in the corresponding community and bipartite, respectively. Node size and edge width indicate their weights. (b) The corresponding adjacency matrix of the graph at the leaf level, ${\mathcal{G}}^2$, where each sub-graph corresponds to a block in the adjacency matrix: communities correspond to diagonal blocks and are shown in different colors, while bipartites are colored in gray. (c) Decomposition of the multinomial distribution as a recursive stick-breaking process where, at each iteration, first a fraction of the remaining weights ${\textnormal{r}}_t$ is allocated to the $t$-th row (corresponding to the $t$-th node in the sub-graph) and then this fraction, ${\textnormal{v}}_t$, is distributed among that row of the lower triangular adjacency matrix, $\hat{A}$. (d), (e) Parallel generation of communities and bipartites, respectively. Shadowed lines are the augmented edges representing candidate edges at each step.
  • Figure 2: Comparison of generation metrics on benchmark 3D point cloud. The baseline results are from liao2019GRAN.
  • Figure 2: An illustration of the generation process of the single community in level $l=1$ of ${\mathcal{HG}}$ in Figure \ref{fig:HG} according to Theorem \ref{thm:mn2bnmn} and equation (\ref{eq:mix_mn_pg}). The total weight of this community graph is 29, determined by the parent node of this community. Consequently, the edge probabilities of this community follow a multinomial distribution. This multinomial is formed as an autoregressive (AR) process and decomposed into a sequence of binomials and multinomials, as outlined in Theorem \ref{thm:mn2bnmn}. At each iteration of this stick-breaking process, first a fraction of the remaining weights ${\textnormal{r}}_t$ is allocated to the $t$-th row (corresponding to the $t$-th node in the sub-graph) and then this fraction, ${\textnormal{v}}_t$, is distributed among that row of the lower triangular adjacency matrix, $\hat{A}$.
  • Figure 3: Samples from HiGen trained on Protein and SBM. Communities are distinguished with different colors and both levels are depicted. The samples for GRAN and SPECTRE are obtained from martinkus2022spectre.
  • Figure 4: Samples from HiGen trained on Protein and SBM. Communities are distinguished with different colors and both levels are depicted. The samples for GDSS are obtained from jo2022scoreBased.
  • ...and 2 more figures
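
The stick-breaking process described in the captions of Figures 1(c) and 2 can be sketched as follows. This is only an illustration under stated assumptions: a fixed probability table `theta_rows` over the lower-triangular entries stands in for the neural networks that HiGen uses to predict the per-step parameters, and the helper names (`binomial`, `multinomial`, `stick_breaking_rows`) are hypothetical. The total weight of 29 is taken from the Figure 2 caption; the 4-node community and uniform probabilities are made up.

```python
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))


def multinomial(rng, n, probs):
    """Draw from Multinomial(n, probs) as a chain of binomial draws."""
    counts, remaining, mass = [], n, 1.0
    for p in probs[:-1]:
        share = min(1.0, p / mass) if mass > 0 else 0.0
        c = binomial(rng, remaining, share)
        counts.append(c)
        remaining -= c
        mass -= p
    counts.append(remaining)  # the last entry takes whatever is left
    return counts


def stick_breaking_rows(rng, total, theta_rows):
    """Distribute `total` integer weight over lower-triangular rows.

    theta_rows[t] holds the (positive) probabilities of the edges in
    row t; all entries together are assumed to sum to 1.
    """
    remaining = total  # r_t: weight not yet allocated
    mass = sum(sum(row) for row in theta_rows)
    rows = []
    for t, row in enumerate(theta_rows):
        row_mass = sum(row)
        if t == len(theta_rows) - 1:
            v_t = remaining  # last row receives the remainder
        else:
            # Step 1: binomial draw of the row total v_t out of r_t.
            v_t = binomial(rng, remaining, min(1.0, row_mass / mass))
        # Step 2: multinomial split of v_t among the edges of row t.
        rows.append(multinomial(rng, v_t, [p / row_mass for p in row]))
        remaining -= v_t
        mass -= row_mass
    return rows


# Hypothetical 4-node community: rows for nodes 1..3, uniform edge
# probabilities, total weight 29 as in the Figure 2 caption.
theta_rows = [[1 / 6], [1 / 6, 1 / 6], [1 / 6, 1 / 6, 1 / 6]]
rng = random.Random(0)
rows = stick_breaking_rows(rng, 29, theta_rows)
```

Because every step only splits the weight that is still unallocated, the sampled entries are non-negative integers that always sum to the prescribed total, which is the property that makes an autoregressive, row-by-row generation of integer edge weights possible.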

Theorems & Definitions (12)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem B.1
  • proof
  • Lemma 1
  • proof
  • Lemma 2
  • Lemma 3
  • Theorem B.2
  • proof
  • ...and 2 more