LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation

Mufei Li, Viraj Shitole, Eli Chien, Changhai Man, Zhaodong Wang, Srinivas Sridharan, Ying Zhang, Tushar Krishna, Pan Li

TL;DR

LayerDAG introduces a layerwise autoregressive diffusion approach to DAG generation by decomposing a DAG into a sequence of bipartite graphs and modeling each layer's node attributes and edges with diffusion conditioned on prior layers. This combines autoregressive modeling for directional dependencies with diffusion-driven refinement for intra-layer dependencies, while preserving permutation invariance through a BiMPNN encoder and set-based predictions. Empirical results show LayerDAG outperforms autoregressive and diffusion baselines on synthetic and real DAG datasets, scales to hundreds of nodes, enables conditional generation with unseen label regimes, and yields synthetic DAGs that improve ML surrogate models used for system benchmarking. The method thus enables scalable, IP-preserving DAG generation with practical impact on hardware/software co-design and performance benchmarking through high-quality synthetic graphs.

Abstract

Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. DAG generative models facilitate the creation of synthetic DAGs, which can be used for benchmarking computing systems while preserving intellectual property. However, generating realistic DAGs is challenging due to their inherent directional and logical dependencies. This paper introduces LayerDAG, an autoregressive diffusion model, to address these challenges. LayerDAG decouples the strong node dependencies into manageable units that can be processed sequentially. By interpreting the partial order of nodes as a sequence of bipartite graphs, LayerDAG leverages autoregressive generation to model directional dependencies and employs diffusion models to capture logical dependencies within each bipartite graph. Comparative analyses demonstrate that LayerDAG outperforms existing DAG generative models in both expressiveness and generalization, particularly for generating large-scale DAGs with up to 400 nodes, a critical scenario for system benchmarking. Extensive experiments on both synthetic and real-world flow graphs from various computing platforms show that LayerDAG generates valid DAGs with superior statistical properties and benchmarking performance. The synthetic DAGs generated by LayerDAG enhance the training of ML-based surrogate models, resulting in improved accuracy in predicting performance metrics of real-world DAGs across diverse computing platforms.
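
To make the phrase "a sequence of bipartite graphs" concrete, the following is a minimal sketch of one way to compute a layerwise partition. It assumes a node's layer is the length of the longest path reaching it from any source node, so that every edge points from an earlier layer to a later one. The function name `layerwise_partition` and the edge-list input format are illustrative choices, not taken from the paper.

```python
from collections import defaultdict, deque


def layerwise_partition(num_nodes, edges):
    """Group nodes by longest-path depth from the source nodes.

    edges is an iterable of (u, v) pairs meaning u -> v.
    Returns (layers, bipartite): layers[d] lists the nodes at depth d,
    and bipartite[d] lists the edges whose target node is at depth d.
    Assumption: "layer" means longest-path depth (hypothetical helper).
    """
    edges = list(edges)
    successors = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Kahn-style topological traversal; a node's depth is final once all
    # of its predecessors have been processed.
    depth = [0] * num_nodes
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("graph contains a cycle")

    num_layers = max(depth, default=-1) + 1
    layers = [[] for _ in range(num_layers)]
    for n in range(num_nodes):
        layers[depth[n]].append(n)

    bipartite = [[] for _ in range(num_layers)]
    for u, v in edges:
        bipartite[depth[v]].append((u, v))
    return layers, bipartite


example_edges = [(0, 2), (1, 2), (2, 3), (0, 3), (1, 4)]
layers, bipartite = layerwise_partition(5, example_edges)
print(layers)  # [[0, 1], [2, 4], [3]]
```

Under this assumed definition, node 3 lands in the third layer even though it has a direct edge from source node 0, because its longest incoming path runs through node 2.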

Paper Structure

This paper contains 24 sections, 1 theorem, 2 equations, 4 figures, 5 tables.

Key Result

Proposition 3.1

For any depth $l$, $p_{\theta}\left(|\mathcal{V}^{(l+1)}| \mid G^{(\leq l)} \right)$, $p_{\theta}\left(\mathbf{X}^{(l+1)} \mid G^{(\leq l)}, |\mathcal{V}^{(l+1)}|\right)$, and $p_{\theta}\left(\mathbf{A}^{(l+1)} \mid G^{(\leq l)}, \mathbf{X}^{(l+1)}\right)$ are permutation invariant. Hence, LayerDA

Figures (4)

  • Figure 1: (a) A real-world DAG (the computation flow for a transformer layer; Vaswani et al., 2017) encompasses complex logical and directional dependencies. Examples of logical dependencies include 1) dimension matching in matrix multiplications and 2) exactly two matrices pointing to a $\times$ operation. One example of a directional dependency here is softmax(qk)v being computed after qk. (b) Each DAG has a unique layerwise partition, an ordered partition of nodes/edges into a sequence of bipartite graphs. In LayerDAG, each bipartite graph $G^{(l+1)}$ is generated by a diffusion model conditioned on $G^{(\leq l)}$. LayerDAG generates, in order, the number of new nodes, their attributes, and the new edges.
  • Figure 2: Generation quality with respect to time budget for LP $(\rho=0)$, TPU Tile, and HLS.
  • Figure 3: Layer size distribution in the real-world datasets.
  • Figure 4: Scatter plot of the sorted labels. The label distributions in the real-world datasets are long-tailed.
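
The generation order described in the Figure 1 caption (layer size, then node attributes, then edges) can be sketched as a simple loop. This is a toy illustration only: the three `sample_*` functions are hypothetical random placeholders standing in for the learned, diffusion-based components, and stopping when the sampled layer size is 0 is an assumption made here for the sketch.

```python
import random


def sample_layer_size(layers, rng, max_layers=4, max_width=3):
    # Placeholder for p(|V^(l+1)| | G^(<=l)); a size of 0 ends generation
    # (assumed stopping rule for this sketch).
    if len(layers) >= max_layers:
        return 0
    low = 1 if not layers else 0  # force a non-empty first layer
    return rng.randint(low, max_width)


def sample_attributes(size, rng, num_types=3):
    # Placeholder for p(X^(l+1) | G^(<=l), |V^(l+1)|): one categorical
    # attribute per new node.
    return [rng.randrange(num_types) for _ in range(size)]


def sample_edges(layers, new_nodes, rng):
    # Placeholder for p(A^(l+1) | G^(<=l), X^(l+1)). Edges only point from
    # existing nodes to new nodes, so no cycle can ever be created.
    if not layers:
        return []
    earlier = [n for layer in layers for n in layer]
    previous = layers[-1]
    edges = []
    for v in new_nodes:
        # At least one parent in the previous layer keeps v at this depth.
        parents = {rng.choice(previous)}
        parents.update(u for u in earlier if rng.random() < 0.2)
        edges.extend((u, v) for u in sorted(parents))
    return edges


def generate_dag(seed=0):
    rng = random.Random(seed)
    layers, attrs, edges = [], {}, []
    next_id = 0
    while True:
        size = sample_layer_size(layers, rng)
        if size == 0:
            break
        new_nodes = list(range(next_id, next_id + size))
        next_id += size
        for node, attr in zip(new_nodes, sample_attributes(size, rng)):
            attrs[node] = attr
        edges.extend(sample_edges(layers, new_nodes, rng))
        layers.append(new_nodes)
    return layers, attrs, edges


layers, attrs, edges = generate_dag(seed=0)
print(len(layers), len(attrs), len(edges))
```

Because every new edge runs from an already generated node to a newly added one, the output is acyclic by construction, whatever the samplers produce. The toy edge sampler ignores the node attributes; a learned model would condition on them.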

Theorems & Definitions (1)

  • Proposition 3.1: permutation invariance of LayerDAG
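
Read together, the three conditionals named in Proposition 3.1 suggest a layerwise autoregressive factorization of the following form. This is a sketch rather than a quotation from the paper: the indexing convention ($G^{(\leq 0)}$ as the empty graph, $L$ as the number of layers) is an assumption, and any factor that models termination is omitted.

$$
p_{\theta}(G) = \prod_{l=0}^{L-1} p_{\theta}\left(|\mathcal{V}^{(l+1)}| \mid G^{(\leq l)}\right) \, p_{\theta}\left(\mathbf{X}^{(l+1)} \mid G^{(\leq l)}, |\mathcal{V}^{(l+1)}|\right) \, p_{\theta}\left(\mathbf{A}^{(l+1)} \mid G^{(\leq l)}, \mathbf{X}^{(l+1)}\right)
$$

If each factor is permutation invariant, as the proposition states, then relabeling the nodes leaves every factor unchanged, and so the product is unchanged as well.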