SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models

Zicheng Cai, Yaohua Tang, Yutao Lai, Hua Wang, Zhi Chen, Hao Chen

TL;DR

SEKI tackles the high cost of neural architecture search by introducing a two-stage, LLM-driven framework that requires no domain-specific data. Through Self-Evolution, SEKI iteratively refines architectures guided by performance feedback, building a knowledge repository of top designs. In Knowledge Inspiration, the LLM analyzes a subset of high-quality architectures to extract design patterns and generate new candidates, further enriching the repository. Across DARTS, NAS201, and Trans101 benchmarks, SEKI achieves state-of-the-art efficiency (as low as $0.05$ GPU-days) and competitive accuracy, while demonstrating strong generalization to ImageNet and multiple tasks. The approach highlights the potential of LLMs to drive NAS with minimal data prerequisites and robust performance, opening avenues for broader application and hybrid optimization strategies.

Abstract

We introduce SEKI, a novel large language model (LLM)-based neural architecture search (NAS) method. Inspired by the chain-of-thought (CoT) paradigm in modern LLMs, SEKI operates in two key stages: self-evolution and knowledge distillation. In the self-evolution stage, LLMs initially lack sufficient reference examples, so we implement an iterative refinement mechanism that enhances architectures based on performance feedback. Over time, this process accumulates a repository of high-performance architectures. In the knowledge distillation stage, LLMs analyze common patterns among these architectures to generate new, optimized designs. By combining these two stages, SEKI leverages the capacity of LLMs for NAS without requiring any domain-specific data. Experimental results show that SEKI achieves state-of-the-art (SOTA) performance across various datasets and search spaces while requiring only 0.05 GPU-days, outperforming existing methods in both efficiency and accuracy. Furthermore, SEKI demonstrates strong generalization capabilities, achieving SOTA-competitive results across multiple tasks.
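The two-stage loop described in the abstract can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the function names (`seki_search`, `toy_evaluate`, `make_toy_refine`, `toy_inspire`), the tuple encoding of an architecture, and the operation list `OPS` are all hypothetical, and the two LLM calls are replaced by simple stand-ins (a random single-edge mutation and a per-position majority vote) so that the example runs without any model. It is also an assumption that self-evolution always continues from the most recent architecture.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical encoding: an architecture is a tuple with one operation per edge.
Architecture = Tuple[str, ...]
OPS = ("skip_connect", "sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3")


def seki_search(
    evaluate: Callable[[Architecture], float],
    refine: Callable[[Architecture, float], Architecture],
    inspire: Callable[[Sequence[Architecture]], Architecture],
    initial: Architecture,
    n_evolve: int,
    n_inspire: int,
    k: int,
    xi: int,
    seed: int = 0,
) -> List[Tuple[float, Architecture]]:
    """Return the top-k (score, architecture) pairs found by the two stages."""
    rng = random.Random(seed)
    repository: List[Tuple[float, Architecture]] = []

    def record(arch: Architecture) -> float:
        # Evaluate an architecture and keep only the k best distinct ones.
        score = evaluate(arch)
        if all(arch != stored for _, stored in repository):
            repository.append((score, arch))
            repository.sort(key=lambda pair: pair[0], reverse=True)
            del repository[k:]
        return score

    # Stage 1, self-evolution: refine the current architecture using
    # its own performance feedback (stand-in for the LLM call).
    current = initial
    score = record(current)
    for _ in range(n_evolve):
        current = refine(current, score)
        score = record(current)

    # Stage 2: sample xi architectures from the repository and ask the
    # (stand-in) LLM to generate a candidate from their shared patterns.
    for _ in range(n_inspire):
        sample = rng.sample(repository, min(xi, len(repository)))
        candidate = inspire([arch for _, arch in sample])
        record(candidate)

    return repository


def toy_evaluate(arch: Architecture) -> float:
    # Toy proxy score (not a real accuracy): fraction of 3x3 separable convs.
    return sum(op == "sep_conv_3x3" for op in arch) / len(arch)


def make_toy_refine(rng: random.Random) -> Callable[[Architecture, float], Architecture]:
    def refine(arch: Architecture, score: float) -> Architecture:
        # Stand-in for the LLM: change one randomly chosen edge.
        i = rng.randrange(len(arch))
        return arch[:i] + (rng.choice(OPS),) + arch[i + 1:]
    return refine


def toy_inspire(archs: Sequence[Architecture]) -> Architecture:
    # Stand-in for the LLM: take the most common operation at each position.
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*archs))


top = seki_search(
    toy_evaluate, make_toy_refine(random.Random(0)), toy_inspire,
    initial=("skip_connect",) * 4, n_evolve=20, n_inspire=5, k=3, xi=2,
)
```

In a real setting, `evaluate` would train or look up the architecture on a benchmark, and `refine` and `inspire` would wrap prompts to an LLM; the sketch only fixes the control flow and the top-$k$ repository bookkeeping.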

Paper Structure

This paper contains 24 sections, 10 figures, 8 tables, 1 algorithm.

Figures (10)

  • Figure 1: Speed-performance comparison of our proposed SEKI with other NAS methods on CIFAR-10 (methods over 1 GPU day are not included).
  • Figure 2: Framework of SEKI. SEKI is composed of two stages: self-evolution and knowledge inspiration. In each iteration of self-evolution, the LLM analyzes the current architecture and its performance metrics, generates optimization strategies, and produces a new, refined architecture. Over successive iterations, the high-performing architectures are collected and the top $k$ are stored in a knowledge repository. In knowledge inspiration, the LLM then summarizes and analyzes $\xi$ validated high-quality architectures from the knowledge repository, extracts common design patterns, and generates new candidate architectures.
  • Figure 3: Prompt framework for Self-Evolution.
  • Figure 4: Prompt framework for Knowledge Inspiration.
  • Figure 5: An example of Self-Evolution.
  • ...and 5 more figures