
FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search

Haoming Zhang, Ran Cheng

TL;DR

This work introduces a novel GNN predictor for NAS that renders neural architectures into vector representations by combining both the conventional and inverse graph views, and incorporates a customized training loss within the GNN predictor to ensure efficient utilization of both types of representations.

Abstract

Neural Architecture Search (NAS) has emerged as a key tool for identifying optimal configurations of deep neural networks tailored to specific tasks. However, training and assessing numerous architectures introduces considerable computational overhead. One way to mitigate this is to use performance predictors, which estimate the potential of an architecture without exhaustive training. Given that neural architectures fundamentally resemble Directed Acyclic Graphs (DAGs), Graph Neural Networks (GNNs) are a natural choice for such predictive tasks. Nevertheless, the scarcity of training data can limit the precision of GNN-based predictors. To address this, we introduce a novel GNN predictor for NAS. This predictor renders neural architectures into vector representations by combining both the conventional and inverse graph views. Additionally, we incorporate a customized training loss within the GNN predictor to ensure efficient utilization of both types of representations. We evaluate our method on benchmark datasets including NAS-Bench-101, NAS-Bench-201, and the DARTS search space, with training sets ranging from 50 to 400 samples. Compared with leading GNN predictors, our method shows a significant improvement in prediction accuracy, with a 3% to 16% increase in Kendall-tau correlation. Source code is available at https://github.com/EMI-Group/fr-nas.
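
To make the two graph views and the evaluation metric concrete, here is a minimal sketch in plain Python. It assumes an architecture cell is stored as a DAG adjacency matrix, so the inverse view is simply its transpose. The helper names `reverse_graph` and `kendall_tau`, the toy cell, and the accuracy values are all hypothetical and not taken from the FR-NAS code. The sketch uses the simple tau-a variant of Kendall's tau, and the paper may use a different tie-handling variant.

```python
from itertools import combinations


def reverse_graph(adj):
    """Flip every edge of a DAG by transposing its adjacency matrix."""
    return [list(col) for col in zip(*adj)]


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy cell (hypothetical): input -> op -> output, plus a skip edge input -> output.
forward = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
reverse = reverse_graph(forward)  # same nodes, every edge direction flipped

# Made-up ground-truth and predicted accuracies for four architectures.
true_acc = [0.91, 0.93, 0.89, 0.94]
pred_acc = [0.90, 0.92, 0.91, 0.95]
tau = kendall_tau(true_acc, pred_acc)
```

A predictor that ranks every pair of architectures in the same order as the ground truth would score 1.0. Here one of the six pairs is ranked the wrong way, so the score is lower.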

Paper Structure

This paper contains 19 sections, 6 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Framework of FR-NAS. The architectures are assessed by the performance predictor and represented using both forward and reverse graph encodings (blue arrows). These are processed by two GIN encoders into feature vectors, which are then fed to MLPs. Two training losses, the Instance Relationship Graph (IRG) loss and the Mean Square Error (MSE) loss, are incorporated, targeting features and predictions respectively (red arrows).
  • Figure 2: Differences in the IRG matrix of embedding vectors when trained without the proposed feature loss, using 50, 200, and 400 samples from NAS-Bench-201, respectively.
  • Figure 3: Differences in the IRG matrix of embedding vectors when trained with(out) the proposed feature loss, using 50 and 400 samples from DARTS search space, respectively.
  • Figure 4: Parameter sensitivity analysis of the weight coefficient $\lambda$.
  • Figure 5: Comparison of variants on three benchmark datasets. NPENAS-Forward: employing the same architecture as FR-NAS but taking only the forward direction of graph encoding as inputs. NPENAS-FR: using the same inputs and architecture as FR-NAS but without using our training method. NPNAS-DirGIN: a GIN adaptation of NPNAS featuring bidirectional GIN layers.
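
The training objective described in the Figure 1 caption, weighted by the coefficient $\lambda$ from Figure 4, can be sketched as follows. This is an illustrative reading of the captions, not the authors' implementation. It assumes the IRG matrix holds pairwise Euclidean distances between the embeddings in a batch, that the feature loss is the mean squared difference between the forward and reverse IRG matrices, and that the MSE terms of the two branches are simply summed. All function names and the default `lam=0.5` are hypothetical.

```python
import math


def irg_matrix(features):
    """Pairwise Euclidean distances between the embeddings of one batch."""
    return [[math.dist(a, b) for b in features] for a in features]


def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def feature_loss(fwd_feats, rev_feats):
    """Mean squared difference between the forward and reverse IRG matrices."""
    g_fwd, g_rev = irg_matrix(fwd_feats), irg_matrix(rev_feats)
    n = len(g_fwd)
    total = sum((g_fwd[i][j] - g_rev[i][j]) ** 2
                for i in range(n) for j in range(n))
    return total / (n * n)


def total_loss(fwd_pred, rev_pred, targets, fwd_feats, rev_feats, lam=0.5):
    """Prediction loss (MSE on both branches) plus lam times the feature loss."""
    pred_loss = mse(fwd_pred, targets) + mse(rev_pred, targets)
    return pred_loss + lam * feature_loss(fwd_feats, rev_feats)


# Two embeddings per encoder (made-up values). The reverse embeddings are a
# shifted copy of the forward ones, so their pairwise distances agree.
fwd = [[0.0, 0.0], [3.0, 4.0]]
rev_shifted = [[1.0, 1.0], [4.0, 5.0]]
aligned = feature_loss(fwd, rev_shifted)

# Stretching the reverse embeddings changes the distances and is penalized.
rev_stretched = [[0.0, 0.0], [6.0, 8.0]]
loss = total_loss([0.9, 0.8], [0.9, 0.8], [0.9, 0.8], fwd, rev_stretched)
```

Under these assumptions the feature term compares only the relational structure of the two embedding spaces. The two encoders are not forced to produce identical vectors, only batches whose pairwise distances agree.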