Sequential Large Language Model-Based Hyper-parameter Optimization

Kanan Mahammadli; Seyda Ertekin

TL;DR

This work addresses hyperparameter optimization (HPO) for machine learning models by introducing SLLMBO, a sequential framework that uses large language models (LLMs) to adapt the search space and initialize parameters, and that combines LLM-based suggestions with a Tree-structured Parzen Estimator (TPE) sampler to balance exploration and exploitation. The approach is benchmarked with multiple LLMs (GPT-3.5-Turbo, GPT-4o, Claude-Sonnet-3.5, Gemini-1.5-Flash) on 14 tabular tasks. The results show that LLM-based initialization often improves optimization, and that the LLM-TPE sampler outperforms fully LLM-based methods and beats traditional Bayesian optimization in 9 of the 14 tasks. LangChain-based memory management further improves stability and enables longer optimization runs, though overexploitation and API cost remain challenges. The study lays groundwork for benchmarking open-source LLMs in HPO, highlights the need for reproducibility, and points to future extensions to broader data modalities such as image and translation tasks.

Abstract

This study introduces SLLMBO, a framework that leverages large language models (LLMs) for hyperparameter optimization (HPO), incorporating dynamic search space adaptability, enhanced parameter space exploitation, and a novel LLM-Tree-structured Parzen Estimator (LLM-TPE) sampler. By addressing limitations in recent fully LLM-based methods and traditional Bayesian optimization (BO), SLLMBO achieves more robust optimization. The benchmark evaluates multiple LLMs, including GPT-3.5-Turbo, GPT-4o, Claude-Sonnet-3.5, and Gemini-1.5-Flash, extending prior work and establishing SLLMBO as the first framework to benchmark a diverse set of LLMs for HPO. By integrating LLMs' established strengths in parameter initialization with the exploitation abilities demonstrated in this study, alongside TPE's exploration capabilities, the LLM-TPE sampler achieves a balanced exploration-exploitation trade-off, reduces API costs, and mitigates premature early stopping for more effective parameter searches. Across 14 tabular tasks in classification and regression, the LLM-TPE sampler outperformed fully LLM-based methods and achieved superior results over BO methods in 9 tasks. Testing early stopping in budget-constrained scenarios demonstrated competitive performance, indicating that LLM-based methods generally benefit from extended iterations for optimal results. This work lays the foundation for future research exploring open-source LLMs, reproducibility of LLM results in HPO, and benchmarking SLLMBO on complex datasets, such as image classification, segmentation, and machine translation.
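To make the idea of a hybrid sampler concrete, the following is a minimal sketch of a sequential search loop that picks, at each trial, between an exploitation-oriented proposer and an exploration-oriented one, and stops early after a run of non-improving trials. It is not the authors' implementation: the two proposers are hypothetical stand-ins (a local perturbation in place of an LLM call, uniform sampling in place of TPE), and the names `run_search`, `exploit_prob`, and `patience`, along with their default values, are assumptions chosen for illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
History = List[Tuple[Params, float]]  # (params, loss); lower loss is better
Proposer = Callable[[History], Params]


def make_exploit_proposer(rng: random.Random, low: float, high: float) -> Proposer:
    """Hypothetical stand-in for the LLM: perturbs the best point seen so far."""
    def propose(history: History) -> Params:
        if not history:
            return {"x": (low + high) / 2}  # simple starting guess
        best_params, _ = min(history, key=lambda item: item[1])
        step = rng.uniform(-0.05, 0.05) * (high - low)
        return {"x": min(high, max(low, best_params["x"] + step))}
    return propose


def make_explore_proposer(rng: random.Random, low: float, high: float) -> Proposer:
    """Hypothetical stand-in for TPE: samples uniformly from the range."""
    def propose(history: History) -> Params:
        return {"x": rng.uniform(low, high)}
    return propose


def run_search(
    objective: Callable[[Params], float],
    exploit: Proposer,
    explore: Proposer,
    rng: random.Random,
    n_trials: int = 30,
    exploit_prob: float = 0.5,  # assumed mixing probability
    patience: int = 10,         # assumed early-stopping patience
) -> History:
    history: History = []
    best_loss = float("inf")
    stale = 0  # consecutive trials without improvement
    for _ in range(n_trials):
        # Choose which proposer suggests the next configuration.
        proposer = exploit if rng.random() < exploit_prob else explore
        params = proposer(history)
        loss = objective(params)
        history.append((params, loss))
        if loss < best_loss:
            best_loss, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stopping
    return history


rng = random.Random(0)
history = run_search(
    lambda p: (p["x"] - 0.3) ** 2,  # toy objective with its minimum at x = 0.3
    make_exploit_proposer(rng, 0.0, 1.0),
    make_explore_proposer(rng, 0.0, 1.0),
    rng,
)
best_params, best_loss = min(history, key=lambda item: item[1])
```

In a real setup the exploitation proposer would be replaced by an LLM call that receives the trial history, and the exploration proposer by an actual TPE sampler. The loop structure, which keeps a shared history and lets either source propose the next point, is the part this sketch is meant to illustrate.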
