NADER: Neural Architecture Design via Multi-Agent Collaboration

Zekang Yang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

TL;DR

The paper tackles the challenge of Neural Architecture Design (NAD) by introducing NADER, an LLM-driven, multi-agent framework that operates beyond predefined search spaces. It integrates a Research Team (Reader and Proposer) and a Development Team (Modifier and Reflector) around a graph-based neural architecture representation to iteratively improve a base network, guided by immediate and historical feedback. Key contributions include the Reflector mechanism for learning from experience, a graph-based NAD representation to focus on high-level structure, and a dedicated NAD benchmark for evaluation. Empirical results show NADER can design high-performing architectures beyond traditional NAS bounds, outperforming state-of-the-art methods and demonstrating scalability on large runs, with notable efficiency gains and generalization across datasets.

Abstract

Designing effective neural architectures poses a significant challenge in deep learning. While Neural Architecture Search (NAS) automates the search for optimal architectures, existing methods are often constrained by predetermined search spaces and may miss critical neural architectures. In this paper, we introduce NADER (Neural Architecture Design via multi-agEnt collaboRation), a novel framework that formulates neural architecture design (NAD) as an LLM-based multi-agent collaboration problem. NADER employs a team of specialized agents to enhance a base architecture through iterative modification. Current LLM-based NAD methods typically operate independently, lacking the ability to learn from past experiences, which results in repeated mistakes and inefficient exploration. To address this issue, we propose the Reflector, which effectively learns from immediate feedback and long-term experiences. Additionally, unlike previous LLM-based methods that use code to represent neural architectures, we utilize a graph-based representation. This approach allows agents to focus on design aspects without being distracted by coding. We demonstrate the effectiveness of NADER in discovering high-performing architectures beyond predetermined search spaces through extensive experiments on benchmark tasks, showcasing its advantages over state-of-the-art methods. The code will be released soon.
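
The iterative modification loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `Candidate`, `Experience`, `design_loop`, the three "ideas", and the hand-written score gains are all hypothetical stand-ins, and the LLM agents and network training are replaced by plain deterministic functions so the control flow is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    """One architecture in the design tree (illustrative fields only)."""
    name: str
    score: float
    parent: Optional[str] = None


@dataclass
class Experience:
    """Records kept across iterations: (idea, parent name, helped?)."""
    records: List[Tuple[str, str, bool]] = field(default_factory=list)


def design_loop(
    base: Candidate,
    propose: Callable[[Candidate, Experience], str],
    modify: Callable[[Candidate, str], str],
    evaluate: Callable[[str], float],
    steps: int,
) -> Tuple[Candidate, Experience]:
    pool = [base]
    memory = Experience()
    for _ in range(steps):
        # Proposer-like step: pick the best candidate so far and an idea for it.
        parent = max(pool, key=lambda c: c.score)
        idea = propose(parent, memory)
        # Modifier-like step: apply the idea to obtain a new architecture.
        child_name = modify(parent, idea)
        score = evaluate(child_name)
        pool.append(Candidate(child_name, score, parent.name))
        # Reflector-like step: record whether the change helped.
        memory.records.append((idea, parent.name, score > parent.score))
    return max(pool, key=lambda c: c.score), memory


# Toy stand-ins (assumptions): fixed ideas with hand-written accuracy gains.
GAINS = {"add skip connection": 2.0, "widen conv": -1.0, "add attention": 1.0}


def propose(parent: Candidate, memory: Experience) -> str:
    tried = {idea for idea, _, _ in memory.records}
    for idea in GAINS:
        if idea not in tried:  # use experience to avoid repeating an idea
            return idea
    return next(iter(GAINS))


def modify(parent: Candidate, idea: str) -> str:
    return f"{parent.name}+{idea}"


def evaluate(name: str) -> float:
    # Pretend accuracy: base 70.0 plus the gain of every applied idea.
    return 70.0 + sum(GAINS[idea] for idea in name.split("+")[1:])


best, memory = design_loop(Candidate("resnet", 70.0), propose, modify, evaluate, steps=3)
print(best.name, best.score)
```

In this toy run the second idea lowers the score, the record of that failure stays in `memory`, and the third step branches again from the better parent. In the framework described above, the proposal, modification, and reflection steps would be performed by LLM agents and evaluation by actual training.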

Paper Structure

This paper contains 34 sections, 9 figures, 16 tables, and 3 algorithms.

Figures (9)

  • Figure 1: Traditional Neural Architecture Search (NAS) approaches aim to discover high-performing neural architectures within an expert-defined search space. In contrast, Neural Architecture Design (NAD) is not constrained by predefined search spaces, allowing for the exploration of entirely new architectures.
  • Figure 2: Overview of the NADER framework. The Reader continuously learns from academic literature, while the Proposer identifies the most promising candidate networks and suggests modifications. The Modifier implements these suggestions, and the Reflector analyzes and provides feedback on the results. The performance of the modified network is relayed back to the Proposer, informing subsequent proposals and fostering a continuous cycle of improvement.
  • Figure 3: Illustration of graph-based neural architecture representation. Left: Visualization of DAG. Right: Text representation of DAG for LLM understanding.
  • Figure 4: Large-scale NAD experiments. Each node represents a model, and each edge indicates which model it was derived from by modification. The darker the node color, the higher the accuracy on the test set. The root node in the center is ResNet. The path highlighted by the red arrows is the improvement path of the best model found.
  • Figure A1: Distribution of test accuracy of 500 models designed by NADER on CIFAR-100.
  • ...and 4 more figures
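
The graph-based representation mentioned for Figure 3 (a DAG plus a text form an LLM can read) can be sketched roughly as below. The exact text format used by NADER is not given in this summary, so the `node: op <- inputs` line format, the `to_text` helper, and the example `residual_block` are assumptions made purely for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical DAG encoding: node name -> (operation, list of input nodes).
residual_block = {
    "in": ("input", []),
    "conv1": ("conv3x3", ["in"]),
    "relu1": ("relu", ["conv1"]),
    "conv2": ("conv3x3", ["relu1"]),
    "add": ("add", ["conv2", "in"]),  # skip connection from the block input
    "out": ("output", ["add"]),
}


def to_text(graph):
    """Serialize the DAG as one line per node, in topological order.

    Raises graphlib.CycleError if the graph contains a cycle.
    """
    deps = {node: set(inputs) for node, (_, inputs) in graph.items()}
    lines = []
    for node in TopologicalSorter(deps).static_order():
        op, inputs = graph[node]
        src = ", ".join(inputs) if inputs else "-"
        lines.append(f"{node}: {op} <- {src}")
    return "\n".join(lines)


text = to_text(residual_block)
print(text)
```

A compact listing like this describes only operations and connectivity, which matches the stated motivation of letting agents reason about structure without dealing with implementation code. The cycle check comes for free from the standard-library topological sort.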