LeMo-NADe: Multi-Parameter Neural Architecture Discovery with LLMs

Md Hafizur Rahman, Prabuddha Chakraborty

TL;DR

LeMo-NADe tackles edge-aware neural architecture search by removing reliance on predefined search spaces and instead using an expert-system–driven instruction set to steer LLMs (GPT-4 Turbo and Gemini) in iterative architecture generation, training, and evaluation. It formalizes a multi-parameter objective through the combined model effectiveness metric and demonstrates rapid discovery of competitive networks on CIFAR-10, CIFAR-100, and ImageNet16-120 with notable reductions in training energy and CO2 emissions. The results show that LLM backends can produce high-accuracy models under diverse priorities, enabling adaptable NAS for resource-constrained environments, while also revealing backend-specific strengths and limitations. This work suggests a new paradigm for AI-guided AI design, with future directions including training specialized LLMs tailored to architecture search tasks and broader application domains.

Abstract

Building efficient neural network architectures can be a time-consuming task requiring extensive expert knowledge. This task becomes particularly challenging for edge devices because one has to consider parameters such as power consumption during inferencing, model size, inferencing speed, and CO2 emissions. In this article, we introduce a novel framework designed to automatically discover new neural network architectures based on user-defined parameters, an expert system, and an LLM trained on a large amount of open-domain knowledge. The introduced framework (LeMo-NADe) is tailored to be used by non-AI experts, does not require a predetermined neural architecture search space, and considers a large set of edge device-specific parameters. We implement and validate this proposed neural architecture discovery framework using CIFAR-10, CIFAR-100, and ImageNet16-120 datasets while using GPT-4 Turbo and Gemini as the LLM component. We observe that the proposed framework can rapidly (within hours) discover intricate neural network models that perform extremely well across a diverse set of application settings defined by the user.
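
The generate, train, and evaluate loop described above can be pictured with a toy sketch. This is not the paper's algorithm: `propose` stands in for the LLM call, `train_and_measure` fabricates metrics instead of training anything, and `score` is a placeholder weighted sum rather than the paper's combined model effectiveness metric. All names, units, and weights are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Result:
    accuracy: float    # fraction in [0, 1]
    energy_kwh: float  # hypothetical training-energy figure
    size_mb: float     # hypothetical model-size figure


def propose(feedback: str, rng: random.Random) -> dict:
    """Stand-in for the LLM call: returns a random architecture description.

    A real backend would condition on `feedback`; this stub ignores it.
    """
    return {"depth": rng.randint(2, 8), "width": rng.choice([16, 32, 64])}


def train_and_measure(arch: dict) -> Result:
    """Stand-in for training: derives made-up metrics from the architecture."""
    capacity = arch["depth"] * arch["width"]
    return Result(
        accuracy=min(0.95, 0.4 + capacity / 1000),
        energy_kwh=capacity / 100,
        size_mb=capacity / 50,
    )


def score(result: Result, weights: dict) -> float:
    """Placeholder objective: reward accuracy, penalise energy and size."""
    return (
        weights["accuracy"] * result.accuracy
        - weights["energy"] * result.energy_kwh
        - weights["size"] * result.size_mb
    )


def search(weights: dict, iterations: int = 10, seed: int = 0):
    """Iterate propose -> train -> score, keeping the best candidate seen."""
    rng = random.Random(seed)
    best_arch, best_score, feedback = None, float("-inf"), ""
    for _ in range(iterations):
        arch = propose(feedback, rng)
        result = train_and_measure(arch)
        current = score(result, weights)
        if current > best_score:
            best_arch, best_score = arch, current
        # Summary of the last attempt, passed to the next proposal step.
        feedback = f"last arch {arch} scored {current:.3f}"
    return best_arch, best_score
```

Changing the entries of `weights` shifts which candidate wins, which is the sense in which user-defined priorities steer the search in this sketch.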

Paper Structure (21 sections, 3 equations, 5 figures, 10 tables, 3 algorithms)

Figures (5)

  • Figure 1: Overview of the LeMo-NADe framework.
  • Figure 2: Qualitative analysis of the effectiveness of LeMo-NADe with GPT-4 Turbo and Gemini backends.
  • Figure 3: Model generated by GPT-4 Turbo for the CIFAR-10 dataset (temperature 0.6, setting 3), achieving 90.90% test accuracy.
  • Figure 4: Model generated by GPT-4 Turbo for the CIFAR-100 dataset (temperature 0.4, setting 1), achieving 57.74% test accuracy.
  • Figure 5: Model generated by GPT-4 Turbo for the ImageNet16-120 dataset (temperature 0.4, setting 1), achieving 25.97% test accuracy.