From Heuristic Selection to Automated Algorithm Design: LLMs Benefit from Strong Priors

Qi Huang, Furong Ye, Ananta Shahane, Thomas Bäck, Niki van Stein

TL;DR

Providing high-quality algorithmic code examples substantially improves the performance of LLM-driven optimization. The results also highlight the value of integrating benchmarking studies to enhance both the efficiency and robustness of LLM-driven black-box optimization methods.

Abstract

Large Language Models (LLMs) have been widely adopted for automated algorithm design, demonstrating strong abilities in generating and evolving algorithms across various fields. Existing work has largely focused on examining their effectiveness in solving specific problems, with search strategies primarily guided by adaptive prompt designs. In this paper, by investigating the token-wise attribution of prompts to LLM-generated algorithmic code, we show that providing high-quality algorithmic code examples can substantially improve the performance of LLM-driven optimization. Building upon this insight, we propose leveraging prior benchmark algorithms to guide LLM-driven optimization and demonstrate superior performance on two black-box optimization benchmarks: the pseudo-Boolean optimization suite (pbo) and the black-box optimization suite (bbob). Our findings highlight the value of integrating benchmarking studies to enhance both the efficiency and robustness of LLM-driven black-box optimization methods.

Paper Structure (35 sections, 12 equations, 18 figures, 12 tables, 1 algorithm)

Figures (18)

  • Figure 1: The workflow of LLM-driven optimization approaches
  • Figure 2: Heatmap of the token-wise relevance of a given prompt to its corresponding newly generated algorithmic code. The result is obtained on an instruction-tuned 27B Gemma 3 LLM using the AttnLRP explainer. Darker shading indicates a higher aggregated relevance score ($R_i \in [0, 1]$ in Equation \ref{eq: aggregate relevance}) and thus greater importance.
  • Figure 3: Convergence of the refinement-only LHNS and LLaMEA methods on the OneMax problem. The $x$-axis represents the number of algorithms generated by the LLM, and the $y$-axis indicates the best-so-far AUC value. Results are from using Gemini, GPT, and Qwen, respectively (from left to right).
  • Figure 4: Boxplots of the best normalized AUC values obtained by the six approaches on 23 problems of pbo (Top) and 24 problems of bbob (Bottom). The LLM-driven approaches have been tested on Gemini, GPT, and Qwen, respectively (from Left to Right). MCTS-AHD is denoted as MCTS for conciseness.
  • Figure 5: An illustration of the contribution of queries that refine prior benchmark algorithms in BAG. The results obtained from these queries are marked by stars; green indicates that a query triggered an improvement. Results are from using Gemini.
  • ...and 13 more figures