SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models

Zicheng Cai; Yaohua Tang; Yutao Lai; Hua Wang; Zhi Chen; Hao Chen

TL;DR

SEKI tackles the high cost of neural architecture search by introducing a two-stage, LLM-driven framework that requires no domain-specific data. Through Self-Evolution, SEKI iteratively refines architectures guided by performance feedback, building a knowledge repository of top designs. In Knowledge Inspiration, the LLM analyzes a subset of high-quality architectures to extract design patterns and generate new candidates, further enriching the repository. Across DARTS, NAS201, and Trans101 benchmarks, SEKI achieves state-of-the-art efficiency (as low as $0.05$ GPU-days) and competitive accuracy, while demonstrating strong generalization to ImageNet and multiple tasks. The approach highlights the potential of LLMs to drive NAS with minimal data prerequisites and robust performance, opening avenues for broader application and hybrid optimization strategies.

Abstract

We introduce SEKI, a novel large language model (LLM)-based neural architecture search (NAS) method. Inspired by the chain-of-thought (CoT) paradigm in modern LLMs, SEKI operates in two key stages: self-evolution and knowledge distillation. In the self-evolution stage, the LLM initially lacks sufficient reference examples, so we implement an iterative refinement mechanism that improves architectures based on performance feedback. Over time, this process accumulates a repository of high-performance architectures. In the knowledge distillation stage, the LLM analyzes common patterns among these architectures to generate new, optimized designs. By combining these two stages, SEKI exploits the capacity of LLMs for NAS without requiring any domain-specific data. Experimental results show that SEKI achieves state-of-the-art (SOTA) performance across various datasets and search spaces while requiring as little as 0.05 GPU-days, outperforming existing methods in both efficiency and accuracy. Furthermore, SEKI demonstrates strong generalization capabilities, achieving SOTA-competitive results across multiple tasks.
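
To make the two-stage loop concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The LLM calls are replaced by hypothetical stand-ins (`stub_refine` for the feedback-driven refinement, `stub_inspire` for the pattern-based generation), the operation set `OPS` and the scoring function `toy_score` are invented for illustration, and the iteration counts, repository size, and sample size `k` are arbitrary.

```python
import random
from typing import Callable, List, Tuple

Arch = Tuple[str, ...]
OPS = ("conv3x3", "conv5x5", "skip", "maxpool")  # hypothetical operation set


def toy_score(arch: Arch) -> float:
    """Stand-in for validation accuracy (assumption): fraction of conv ops."""
    return sum(op.startswith("conv") for op in arch) / len(arch)


def stub_refine(arch: Arch, score: float, rng: random.Random) -> Arch:
    """Placeholder for an LLM call that revises one architecture given its score.
    This stub ignores the score and changes one randomly chosen operation."""
    new = list(arch)
    new[rng.randrange(len(new))] = rng.choice(OPS)
    return tuple(new)


def stub_inspire(examples: List[Arch], rng: random.Random) -> Arch:
    """Placeholder for an LLM call that reads several strong architectures and
    proposes a new one. This stub recombines the examples position by position."""
    return tuple(rng.choice([a[i] for a in examples]) for i in range(len(examples[0])))


def add_to_repo(repo: List[Tuple[float, Arch]], item: Tuple[float, Arch], size: int) -> None:
    """Insert a scored architecture and keep only the top `size` entries."""
    repo.append(item)
    repo.sort(reverse=True)
    del repo[size:]


def search(evaluate: Callable[[Arch], float], n_evolve: int = 10, n_inspire: int = 10,
           repo_size: int = 5, k: int = 3, seed: int = 0) -> Tuple[float, Arch]:
    rng = random.Random(seed)
    current = tuple(rng.choice(OPS) for _ in range(4))
    score = evaluate(current)
    repo: List[Tuple[float, Arch]] = [(score, current)]

    # Stage 1: refine the latest candidate using its performance feedback,
    # accumulating a repository of the best designs seen so far.
    for _ in range(n_evolve):
        current = stub_refine(current, score, rng)
        score = evaluate(current)
        add_to_repo(repo, (score, current), repo_size)

    # Stage 2: sample a subset of the repository, generate a new candidate
    # from what those designs share, and feed the result back into the repository.
    for _ in range(n_inspire):
        subset = [arch for _, arch in rng.sample(repo, min(k, len(repo)))]
        candidate = stub_inspire(subset, rng)
        add_to_repo(repo, (evaluate(candidate), candidate), repo_size)

    return repo[0]


best_score, best_arch = search(toy_score)
print(best_score, best_arch)
```

In a real setting, `evaluate` would train or otherwise estimate the accuracy of the candidate network, and the two stubs would be prompts to an LLM; only the control flow and the shared repository are meant to mirror the description above.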
