DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models

Sohyun An, Hayeon Lee, Jaehyeong Jo, Seanie Lee, Sung Ju Hwang

TL;DR

DiffusionNAG is a conditional Neural Architecture Generation (NAG) framework that treats neural architectures as directed graphs and generates them with a graph diffusion model. It can flexibly generate task-optimal architectures with the desired properties for diverse tasks.

Abstract

Existing NAS methods spend an excessive amount of time on repetitive sampling and training of many task-irrelevant architectures. To tackle this limitation, we propose a paradigm shift from NAS to a novel conditional Neural Architecture Generation (NAG) framework based on diffusion models, dubbed DiffusionNAG. Specifically, we consider neural architectures as directed graphs and propose a graph diffusion model for generating them. Moreover, with the guidance of parameterized predictors, DiffusionNAG can flexibly generate task-optimal architectures with the desired properties for diverse tasks by sampling from a region that is more likely to satisfy those properties. This conditional NAG scheme is significantly more efficient than previous NAS schemes, which sample architectures and then filter them using property predictors. We validate the effectiveness of DiffusionNAG through extensive experiments in two predictor-based NAS scenarios: Transferable NAS and Bayesian Optimization (BO)-based NAS. DiffusionNAG achieves superior performance with speedups of up to 35 times compared to the baselines on Transferable NAS benchmarks. Furthermore, when integrated into a BO-based algorithm, DiffusionNAG outperforms existing BO-based NAS approaches, particularly in the large MobileNetV3 search space on the ImageNet 1K dataset. Code is available at https://github.com/CownowAn/DiffusionNAG.
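
The idea of steering generation with a predictor, rather than sampling first and filtering afterwards, can be illustrated with a deliberately tiny sketch. It is not the DiffusionNAG implementation: the "architecture" here is a single real number instead of a directed graph, the unconditional score and the predictor gradient are closed-form toy functions rather than trained networks, and every name (`prior_score`, `predictor_grad`, `sample`, `guidance_scale`) is a hypothetical chosen for illustration. The sketch only assumes the generic guidance recipe: at each reverse-diffusion step, the unconditional score is combined with the gradient of the predictor's log-likelihood for the target property `y`.

```python
import math
import random


def sigma_schedule(sigma_max=10.0, sigma_min=0.01, num_steps=200):
    """Geometric noise levels, ordered from largest to smallest."""
    ratio = (sigma_min / sigma_max) ** (1.0 / (num_steps - 1))
    return [sigma_max * ratio ** i for i in range(num_steps)]


def prior_score(x, sigma):
    # Toy stand-in for a learned score network (assumption):
    # clean data ~ N(0, 1), so the noisy marginal is N(0, 1 + sigma^2).
    return -x / (1.0 + sigma ** 2)


def predictor_grad(x, sigma, y, s=0.5):
    # Toy stand-in for a predictor evaluated on a *noisy* sample (assumption):
    # the property is y = x_0 + N(0, s^2), which gives
    # y | x_t ~ N(x_t / a, s^2 + sigma^2 / a) with a = 1 + sigma^2.
    # Returns d/dx_t of log p(y | x_t).
    a = 1.0 + sigma ** 2
    mean = x / a
    var = s ** 2 + sigma ** 2 / a
    return (y - mean) / (var * a)


def sample(y, guidance_scale, rng, sigmas):
    """One reverse-diffusion trajectory (variance-exploding, Euler-Maruyama)."""
    x = rng.gauss(0.0, sigmas[0])  # start from (approximately) pure noise
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        step = sigma ** 2 - sigma_next ** 2
        # Guided score = unconditional score + scaled predictor gradient.
        score = prior_score(x, sigma) + guidance_scale * predictor_grad(x, sigma, y)
        x = x + step * score + math.sqrt(step) * rng.gauss(0.0, 1.0)
    return x


def mean_of_samples(y, guidance_scale, n=2000, seed=0):
    rng = random.Random(seed)
    sigmas = sigma_schedule()
    xs = [sample(y, guidance_scale, rng, sigmas) for _ in range(n)]
    return sum(xs) / n


if __name__ == "__main__":
    target = 2.0
    print("unguided mean:", mean_of_samples(target, guidance_scale=0.0))
    print("guided mean:  ", mean_of_samples(target, guidance_scale=1.0))
```

With `guidance_scale=0.0` the sampler ignores the target and draws from the toy prior, which is the starting point of a sample-then-filter pipeline. With `guidance_scale=1.0` the samples concentrate near the region consistent with the target property, so fewer draws are wasted on candidates a filter would discard.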

Paper Structure (60 sections, 18 equations, 10 figures, 10 tables, 2 algorithms)

Figures (10)

  • Figure 1: Illustration of DiffusionNAG in Transferable NAS Scenarios. DiffusionNAG generates desired neural architectures for a given unseen task by guiding the generation process with the transferable dataset-aware performance predictor $f_{\bm{\phi}^*}(y_{\tau}|\bm{D}_\tau, \bm{A}_t)$.
  • Figure 2: The distribution of generated architectures.
  • Figure 3: Statistics of the generated architectures. Each method generates 1,000 architectures.
  • Figure 3: Comparison Results on Existing AO Strategies. Guided Gen (Ours) strategy provides a pool of candidate architectures, guiding them toward a high-performance distribution using the current population with DiffusionNAG. We report the results of multiple experiments with 10 different random seeds.
  • Figure 4: Experimental Results on Various Acquisition Functions. Ours consistently outperforms the heuristic approaches on various acquisition functions. We run experiments with 10 different random seeds.
  • ...and 5 more figures