LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection

YiMing Yu, Jason Zutty

TL;DR

The paper introduces LLM-Guided Evolution (LLM-GE), an autonomous NAS framework that uses large language models to edit the YAML configurations of YOLO architectures and to steer evolutionary mutations via Evolution of Thought (EoT) for object detection on KITTI. Applying LLM-GE to YOLO models, the authors report performance gains, for example raising holdout mAP@50 from 92.5% to 94.5%, and generate numerous Pareto-optimal variants that balance accuracy and inference speed. The study compares two seeding strategies (GE1, seeded with yolov3 only, and GE2, seeded with multiple YOLO models) and documents how the Pareto frontier and hypervolume evolve over generations, revealing trade-offs and offering guidance for future research. Overall, the work is a proof of concept for combining LLM-driven reasoning with evolutionary search to automate architecture and hyperparameter optimization in real-world object-detection tasks, with implications for scalable AutoML and deployment efficiency.

Abstract

In machine learning, Neural Architecture Search (NAS) requires domain knowledge of model design and a large amount of trial-and-error to achieve promising performance. Meanwhile, evolutionary algorithms have traditionally relied on fixed rules and pre-defined building blocks. The Large Language Model (LLM)-Guided Evolution (GE) framework transformed this approach by incorporating LLMs to directly modify model source code for image classification algorithms on CIFAR data and intelligently guide mutations and crossovers. A key element of LLM-GE is the "Evolution of Thought" (EoT) technique, which establishes feedback loops, allowing LLMs to refine their decisions iteratively based on how previous operations performed. In this study, we perform NAS for object detection by improving LLM-GE to modify the architecture of You Only Look Once (YOLO) models to enhance performance on the KITTI dataset. Our approach intelligently adjusts the design and settings of YOLO to find the optimal algorithms against objectives such as detection accuracy and speed. We show that LLM-GE produced variants with significant performance improvements, such as an increase in Mean Average Precision from 92.5% to 94.5%. This result highlights the flexibility and effectiveness of LLM-GE on real-world challenges, offering a novel paradigm for automated machine learning that combines LLM-driven reasoning with evolutionary strategies.
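
The trade-off between detection accuracy and speed mentioned above is usually handled by keeping only non-dominated candidates. The following is a minimal sketch of that filtering step, not the authors' implementation: the tuple layout (accuracy to maximize, latency to minimize), the function names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate layout: (accuracy, latency_ms).
# Accuracy is maximized, latency is minimized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up population, not results from the paper.
population = [(0.90, 5.0), (0.94, 8.0), (0.92, 9.0), (0.88, 4.0), (0.93, 8.5)]
front = pareto_front(population)
print(front)
```

In this toy population, (0.92, 9.0) and (0.93, 8.5) are dropped because (0.94, 8.0) is both more accurate and faster; the remaining three candidates each win on one objective and so stay on the front.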

Paper Structure

This paper contains 13 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Number of Individuals in the Pareto frontier per generation.
  • Figure 2: Hypervolume for LLM-Guided Evolution of Llama-3.3-GE1 and Llama-3.3-GE2.
  • Figure 3: Parallel coordinate plot of overall Pareto front individuals evaluated on the validation data for both Llama-3.3-GE1 and Llama-3.3-GE2.
  • Figure 4: Parallel coordinate plot of overall Pareto front individuals evaluated on the holdout data for both Llama-3.3-GE1 and Llama-3.3-GE2.
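
Figure 2 tracks hypervolume, a scalar summary of how much of the objective space a Pareto front dominates. This summary does not state the reference point or objective scaling used in the paper, so the following is only a minimal sketch under stated assumptions: two objectives, both minimized (for example an error rate and a latency), and a hypothetical reference point `ref` that is worse than every front member. The function name and the sample points are illustrative.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref`,
    assuming both objectives are minimized."""
    # Ignore points that do not improve on the reference in both objectives.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_y = ref[1]  # lowest second objective seen so far in the sweep
    for x, y in pts:
        if y < best_y:
            # Add the new strip this point contributes beyond earlier points.
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


# Made-up points, not results from the paper; (3.0, 3.0) is dominated.
sample = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
print(hypervolume_2d(sample, ref=(4.0, 4.0)))  # prints 6.0
```

A larger hypervolume means the front has moved closer to the ideal corner or spread along it, which is why a per-generation hypervolume curve is a common way to monitor a multi-objective search.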