Building Optimal Neural Architectures using Interpretable Knowledge

Keith G. Mills, Fred X. Han, Mohammad Salameh, Shengyao Lu, Chunhua Zhou, Jiao He, Fengyu Sun, Di Niu

TL;DR

The paper addresses the high cost of Neural Architecture Search (NAS) with AutoBuild, a method that learns to rank architectural building blocks in a magnitude-ranked, hop-aware embedding space learned by graph neural networks. By aligning subgraph and node embeddings with end-task performance through a differentiable ranking loss and a feature embedding module, AutoBuild assigns interpretable importance scores to architecture modules. These scores can be used to assemble high-performing architectures directly or to prune search spaces. Experiments cover ImageNet macro-search spaces, panoptic segmentation, and generative AI, and report improved Pareto fronts and FID scores from a limited number of evaluated architectures. The work reduces NAS cost while indicating which modules and features drive performance, supporting more efficient architecture design for computer vision and generative modeling tasks.

Abstract

Neural Architecture Search is a costly practice. The fact that a search space can span a vast number of design choices with each architecture evaluation taking nontrivial overhead makes it hard for an algorithm to sufficiently explore candidate networks. In this paper, we propose AutoBuild, a scheme which learns to align the latent embeddings of operations and architecture modules with the ground-truth performance of the architectures they appear in. By doing so, AutoBuild is capable of assigning interpretable importance scores to architecture modules, such as individual operation features and larger macro operation sequences such that high-performance neural networks can be constructed without any need for search. Through experiments performed on state-of-the-art image classification, segmentation, and Stable Diffusion models, we show that by mining a relatively small set of evaluated architectures, AutoBuild can learn to build high-quality architectures directly or help to reduce search space to focus on relevant areas, finding better architectures that outperform both the original labeled ones and ones found by search baselines. Code available at https://github.com/Ascend-Research/AutoBuild
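
To make the idea of aligning embeddings with ground-truth performance more concrete, the following is a minimal sketch, not the paper's actual loss. It assumes that each architecture (or module) has an embedding vector, that its score is the L2 norm of that vector, and that a pairwise logistic ranking loss is used to encourage better-performing items to have larger norms. The function names `l2_norm` and `pairwise_rank_loss` and the toy data are hypothetical.

```python
import math
from itertools import combinations


def l2_norm(vec):
    """Magnitude of an embedding; used here as its scalar score (assumption)."""
    return math.sqrt(sum(x * x for x in vec))


def pairwise_rank_loss(embeddings, targets):
    """Mean logistic loss over all pairs with different targets.

    For a pair where item `hi` has the larger target, the loss is
    log(1 + exp(-(score_hi - score_lo))), which is small when the
    embedding norms are ordered the same way as the targets.
    This is an illustrative stand-in, not the loss from the paper.
    """
    scores = [l2_norm(e) for e in embeddings]
    losses = []
    for i, j in combinations(range(len(scores)), 2):
        if targets[i] == targets[j]:
            continue  # ties carry no ordering information
        hi, lo = (i, j) if targets[i] > targets[j] else (j, i)
        margin = scores[hi] - scores[lo]
        losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy data (made up for illustration): three architectures and their accuracies.
targets = [0.70, 0.75, 0.80]
aligned = [[0.1, 0.0], [1.0, 0.0], [0.0, 2.0]]     # norms grow with accuracy
misaligned = [[0.0, 2.0], [1.0, 0.0], [0.1, 0.0]]  # norms shrink with accuracy

aligned_loss = pairwise_rank_loss(aligned, targets)
misaligned_loss = pairwise_rank_loss(misaligned, targets)
print(aligned_loss < misaligned_loss)
```

Under these assumptions, minimizing such a loss pushes the norm of an embedding to track the performance of the architectures it appears in, which is what makes the norm readable as an importance score.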

Paper Structure

This paper contains 25 sections, 6 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Visualization of a sequence-like architecture DAG. Nodes annotated with position and layer information: 'u' and 'l' refer to stage and layer position, while 'conv' refers to the type of layer. Left graph: Subgraph rooted at the red node induced by a 1-hop message passing layer. Right graph: Additional (orange) nodes incorporated into the subgraph for a 3-hop layer.
  • Figure 2: Test SRCC for PN and MBv3 test sets. Specifically, we train AutoBuild predictors using Eq. \ref{eq:loss} for accuracy or inference latency on several hardware devices, then measure SRCC for every hop level $m \in [0, 4]$ and the predictor MLP ($y'$).
  • Figure 3: Results of training an accuracy predictor for PN and MBv3 using the 'IR' representation format. Top-left: Accuracy histogram. Bottom-left: Hop-wise AutoBuild test SRCC. Right: Operation-type importance scores from the 0-hop FE-MLP. Note: 'GAP' means Global Average Pool and 'BN' means Batch Norm.
  • Figure 4: Best and worst MBConv layer subgraphs for MobileNetV3 on GPU latency, annotated with the stage where they are found. Specifically, we illustrate the best and worst subgraphs, by AutoBuild module score, when different equations are used to compute the predictor targets using accuracy and latency.
  • Figure 5: Top row: Comparing reduced search spaces produced by AutoBuild (colored clusters; K=3) to the accuracy-latency Pareto frontier of the architectures used to train the predictor (grey line). Each cluster corresponds to a specific target equation designed to improve upon a specific region of the frontier. Bottom row: Evolutionary search results comparing the Unit-Reduce search space (K=25) of AutoBuild to other search spaces and mutation techniques. Best viewed in color.
  • ...and 13 more figures
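
Several of the captions above report SRCC (Spearman's rank correlation coefficient) between predictor outputs and ground-truth targets. As a reference for how that metric is computed, here is a small self-contained sketch in plain Python. The helper names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def srcc(predicted, truth):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(truth))


# Toy example (made-up numbers): predictor scores vs. measured accuracy.
scores = [0.2, 1.5, 0.9, 3.1]
accuracy = [0.71, 0.76, 0.74, 0.80]
print(srcc(scores, accuracy))
```

Because SRCC depends only on ordering, it is a natural fit for a predictor trained with a ranking objective: the scale of the scores does not matter, only whether they sort architectures in the same order as the ground truth.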