Automated Design Optimization via Strategic Search with Large Language Models

Anthony Carreon, Vansh Sharma, Venkat Raman

TL;DR

This paper introduces AUTO, an LLM-driven, gradient-free design optimization framework that uses a Strategist and an Implementor to explore ill-defined design spaces, demonstrated on GPU code optimization tasks (chemical kinetics and dense matrix multiplication). The approach emphasizes context curation, strategic decision-making, and constraint-driven evaluation, achieving 50-70% search efficiency relative to Bayesian optimization and performance competitive with expert implementations. It analyzes solution quality, convergence behavior, and the cost implications of LLM-based optimization, revealing both promise and current limitations, such as compilation failures and limited knowledge of niche APIs. The work suggests that automated design optimization in ill-defined spaces is feasible and scalable, with potential impact across hardware-software co-design and multi-objective engineering problems, given future enhancements in knowledge integration and stopping criteria.

Abstract

Traditional optimization methods excel in well-defined search spaces but struggle with design problems where transformations and design parameters are difficult to define. Large language models (LLMs) offer a promising alternative by dynamically interpreting design spaces and leveraging encoded domain knowledge. To this end, we introduce AUTO, an LLM agent framework that treats design optimization as a gradient-free search problem guided by strategic LLM reasoning. The framework employs two collaborative agents: a Strategist that selects between exploration and exploitation strategies, and an Implementor that executes detailed designs. Applied to GPU code optimization -- a domain critical to fields from machine learning to scientific computing -- AUTO generates solutions competitive with expert implementations for chemical kinetics integration and dense matrix multiplication. The framework achieves 50-70% search efficiency relative to Bayesian optimization methodologies. It completes optimizations in approximately 8 hours at an estimated cost of up to \$159 per run, compared to an estimated cost of up to \$480 with median-wage software developers. These findings open the door to automating design optimization in ill-defined search spaces with limited prior information.

Paper Structure

This paper contains 15 sections, 11 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: AUTO: An LLM-based design optimization framework. AUTO functions as follows: (0) Create context from historical design data to inform subsequent iterations. (1) The Strategist selects an optimization strategy. (2) The Implementor generates designs from strategic instructions. (3) Validate designs against constraints. (4) Evaluate and score the design. (5) Record the results. The cycle repeats for $N$ iterations. After each iteration concludes, the chat histories and context windows for the Strategist and Implementor are reset.
  • Figure 2: Details of Steps 2 through 5 from Figure 1 as they apply to GPU code optimization. The "Constrain" and "Evaluate" steps are Python blocks of code executed on one GPU, while interactions with the Implementor are LLM chats executed on a separate GPU.
  • Figure 3: Timing comparisons between the agent-optimized and human-optimized GPU code for (A) the kinetics problem of Robertson (1966) across different problem sizes and (B) matrix multiplication applied to two $N\times N$ matrices. See the experimental results table for the parameters of each run.
  • Figure 4: Code clusters and their correlations with the code runtimes for (A) kinetics run K1 and (B) matrix multiplication run M3. For each application, a "bag of words" approach produces the code vectors from the generated codes, which are then clustered using K-means. Visual embeddings are obtained using t-SNE with perplexity=10 for (A) and perplexity=15 for (B). The colors represent runtimes. The markers represent cluster memberships. The number next to each marker is the iteration at which the code was generated.
  • Figure 5: Search efficiency (calculated from the search efficiency equation) as a function of the exploration factor, $\xi$, from the upper confidence bound (UCB) acquisition function for kinetics run K1 (circles) and matrix multiplication run M3 (squares).
  • ...and 2 more figures
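
The iteration cycle described in the Figure 1 caption (steps 0 through 5) can be illustrated with a minimal sketch. This is not the authors' implementation: the two LLM agents are replaced by random stand-ins, the design is a single number rather than GPU code, and every name here (`Record`, `build_context`, `strategist`, `implementor`, `satisfies_constraints`, `evaluate`, `run_auto`), the 0.3 exploration probability, and the toy objective are assumptions made for illustration only. What the sketch does preserve is the control flow: context is rebuilt from recorded results at the start of each iteration, and nothing else carries over between iterations.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One row of the design history (step 5). Hypothetical structure."""
    iteration: int
    strategy: str
    design: float
    valid: bool
    score: Optional[float]  # None when the design fails the constraints


def build_context(history: List[Record], k: int = 3) -> List[Record]:
    """Step 0: curate context from past results (here: the k best valid designs)."""
    valid = [r for r in history if r.valid]
    return sorted(valid, key=lambda r: r.score)[:k]


def strategist(context: List[Record], rng: random.Random) -> str:
    """Step 1: random stand-in for the Strategist LLM (assumed 30% exploration)."""
    if not context or rng.random() < 0.3:
        return "explore"
    return "exploit"


def implementor(strategy: str, context: List[Record], rng: random.Random) -> float:
    """Step 2: random stand-in for the Implementor LLM; proposes a design."""
    if strategy == "exploit" and context:
        # Refine the best known design with a small perturbation.
        return context[0].design + rng.gauss(0.0, 0.1)
    # Otherwise sample a fresh design from the whole toy design space.
    return rng.uniform(-5.0, 5.0)


def satisfies_constraints(design: float) -> bool:
    """Step 3: toy constraint (for GPU code this might be compile and correctness checks)."""
    return -4.0 <= design <= 4.0


def evaluate(design: float) -> float:
    """Step 4: toy objective to minimize (for GPU code this might be runtime)."""
    return (design - 1.5) ** 2


def run_auto(n_iterations: int = 50, seed: int = 0) -> List[Record]:
    rng = random.Random(seed)
    history: List[Record] = []
    for i in range(n_iterations):
        # Context is rebuilt from scratch each iteration; no chat state persists.
        context = build_context(history)
        strategy = strategist(context, rng)
        design = implementor(strategy, context, rng)
        valid = satisfies_constraints(design)
        score = evaluate(design) if valid else None
        history.append(Record(i, strategy, design, valid, score))
    return history


if __name__ == "__main__":
    results = run_auto()
    best = min((r for r in results if r.valid), key=lambda r: r.score)
    print(f"best design {best.design:.3f} found at iteration {best.iteration}")
```

Invalid designs are still recorded (with no score), so a context builder could surface failures as well as successes; this sketch only forwards the best valid designs.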