TART: Token-based Architecture Transformer for Neural Network Performance Prediction

Yannis Y. He

TL;DR

The Token-based Architecture Transformer (TART) predicts neural network performance without training candidate networks. It attains state-of-the-art performance on the DeepNets-1M dataset for performance prediction tasks without edge information, indicating the potential of Transformers to aid in discovering novel and high-performing neural architectures.

Abstract

In neural architecture design, achieving high performance still relies largely on the manual expertise of researchers. Although Neural Architecture Search (NAS) has emerged as a promising technique for automating this process, current NAS methods still require human input to expand the search space and cannot generate new architectures. This paper explores the potential of Transformers to comprehend neural architectures and their performance, with the objective of laying the foundation for using Transformers to generate novel networks. We propose the Token-based Architecture Transformer (TART), which predicts neural network performance without the need to train candidate networks. TART attains state-of-the-art performance on the DeepNets-1M dataset for performance prediction tasks without edge information, indicating the potential of Transformers to aid in discovering novel and high-performing neural architectures.

Paper Structure

This paper contains 16 sections, 1 equation, 6 figures, and 1 table.

Introduction
Motivations
Contributions
Background
Neural Architecture Search
Parameter and Performance Prediction
Graph Representation of Neural Architectures
Token Representation of Graphs
Transformer-based Generative Model
Method
Experiments and Results
Datasets: DeepNets-1M
Experiment Design
Experiment 1: Pure-Transformer Predictor
Experiment 2: Token-based Transformer Predictor
...and 1 more section

Figures (6)

Figure 1: Current Neural Architecture Search process.
Figure 2: TART is an end-to-end neural predictor, which has three basic stages: 1) tokenization stage, 2) transformer learning stage, and 3) prediction stage.
Figure 3: Examples of computational graphs (visualized using NetworkX). In the visualized graphs, each node is one of the 15 primitives, coded with the markers shown at the bottom and sorted by frequency in the training set.
Figure 4: Measuring correlation between the predicted and ground-truth performance of models on CIFAR-10.
Figure 5: Although we had to halt training due to limited computational resources, our analysis of the linear regression of performance growth suggests that the predictor had not yet overfit the data.
...and 1 more figure
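
The Figure 4 caption refers to measuring correlation between predicted and ground-truth performance. As a rough illustration of what such a measurement can look like, the sketch below computes Kendall's tau, a common rank correlation for performance predictors. This is an assumption made for illustration: the listing above does not say which correlation measure the paper uses, and the function name `kendall_tau` and the toy accuracy values are hypothetical, not results from the paper.

```python
from itertools import combinations


def kendall_tau(predicted, actual):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Hypothetical helper for illustration; tied pairs count as neither
    concordant nor discordant.
    """
    if len(predicted) != len(actual):
        raise ValueError("inputs must have the same length")
    if len(predicted) < 2:
        raise ValueError("need at least two items to form a pair")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(predicted)), 2):
        # Positive product: both lists rank the pair in the same order.
        product = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    total_pairs = len(predicted) * (len(predicted) - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up accuracies for four hypothetical architectures (not paper data).
toy_predicted = [0.91, 0.85, 0.78, 0.88]
toy_actual = [0.93, 0.87, 0.80, 0.86]

tau = kendall_tau(toy_predicted, toy_actual)
print(f"Kendall tau on toy data: {tau:.3f}")
```

A value near 1 means the predictor ranks architectures in almost the same order as their measured performance, which matters more for architecture search than matching the exact accuracy values.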
