Large Language Model Agent for Hyper-Parameter Optimization

Siyi Liu; Chen Gao; Yong Li

TL;DR

AgentHPO tackles the high cost and complexity of hyperparameter optimization with an LLM-driven two-agent framework (Creator and Executor) that autonomously generates, tests, and refines HP configurations using historical training logs. At each trial $t$, the Creator interprets the task background to propose a configuration $H_t$ with a rationale $R_t$; the Executor then trains the model, analyzes the results, and records the outcome. The log $\mathcal{L}$ serves as a growing memory of past trials and insights that guides the search toward the optimum $H^*$. Across 12 cross-domain tasks, the approach often matches or exceeds the best human performance in later trials. GPT-4 generally delivers stronger improvements and lower variance than GPT-3.5, and it provides richer explanations for its decisions. By coupling explainability with autonomous optimization, AgentHPO reduces human effort in AutoML pipelines and offers a scalable, interpretable path toward robust hyperparameter tuning.

Abstract

Hyperparameter optimization is critical in modern machine learning, requiring expert knowledge, numerous trials, and high computational and human resources. Despite the advancements in Automated Machine Learning (AutoML), challenges in terms of trial efficiency, setup complexity, and interpretability still persist. To address these issues, we introduce a novel paradigm leveraging Large Language Models (LLMs) to automate hyperparameter optimization across diverse machine learning tasks, which is named AgentHPO (short for LLM Agent-based Hyperparameter Optimization). Specifically, AgentHPO processes the task information autonomously, conducts experiments with specific hyperparameters (HPs), and iteratively optimizes them based on historical trials. Compared to traditional AutoML methods, this human-like optimization process largely reduces the number of required trials, simplifies the setup process, and enhances interpretability and user trust. Extensive empirical experiments conducted on 12 representative machine-learning tasks indicate that AgentHPO not only matches but often surpasses the best human trials in terms of performance while simultaneously providing explainable results. Further analysis sheds light on the strategies employed by the LLM in optimizing these tasks, highlighting its effectiveness and adaptability in various scenarios.
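
To make the iterative process concrete, the following is a minimal sketch of a Creator/Executor loop with a growing trial log. It is not the authors' implementation: the names `Trial`, `run_agent_hpo`, `toy_creator`, and `toy_executor` are hypothetical, the Creator is a hand-written rule standing in for an LLM call, and the Executor scores a synthetic objective instead of training a model.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

HP = Dict[str, float]


@dataclass
class Trial:
    """One entry of the experiment log (hypothetical structure)."""
    hps: HP          # proposed configuration H_t
    rationale: str   # explanation R_t given by the Creator
    score: float     # metric reported by the Executor (higher is better)


def run_agent_hpo(
    creator: Callable[[List[Trial]], Tuple[HP, str]],
    executor: Callable[[HP], float],
    n_trials: int,
) -> Tuple[Trial, List[Trial]]:
    """Alternate Creator and Executor, appending every trial to the log."""
    log: List[Trial] = []
    for _ in range(n_trials):
        hps, rationale = creator(log)   # Creator reads the history, proposes HPs
        score = executor(hps)           # Executor runs the experiment
        log.append(Trial(hps, rationale, score))
    best = max(log, key=lambda trial: trial.score)
    return best, log


def toy_creator(log: List[Trial]) -> Tuple[HP, str]:
    """Stand-in for the LLM Creator: a simple rule over the history."""
    if not log:
        return {"lr": 0.1}, "Start from a common default learning rate."
    best = max(log, key=lambda trial: trial.score)
    last = log[-1]
    if last is best:
        return {"lr": last.hps["lr"] * 0.1}, "Last change helped; keep decreasing."
    midpoint = math.sqrt(best.hps["lr"] * last.hps["lr"])
    return {"lr": midpoint}, "Overshot; try between the best and the last value."


def toy_executor(hps: HP) -> float:
    """Stand-in for training: synthetic score that peaks at lr = 1e-3."""
    return -((math.log10(hps["lr"]) + 3.0) ** 2)


best, log = run_agent_hpo(toy_creator, toy_executor, n_trials=5)
for i, trial in enumerate(log):
    print(i, trial.hps, round(trial.score, 3), trial.rationale)
print("best:", best.hps)
```

In a real setting, `creator` would wrap a prompt containing the task background and the serialized log, and `executor` would launch training and return the validation metric. Keeping the rationale alongside each configuration is what makes the resulting log readable as an explanation of the search.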

Paper Structure (29 sections, 7 figures, 1 table, 1 algorithm)

Introduction
Related Works
LLM-based Autonomous Agents
LLMs for AutoML
LLMs-enhanced Model Optimization
Methodology
Creator Agent
Executor Agent
Iterative Hyperparameter Optimization
Explainable Hyperparameter Optimization
Benchmark Setting
Task descriptions
Experimental Setup
Results and Analysis
Trajectory over Trials
...and 14 more sections

Figures (7)

Figure 1: Comparative Frameworks in Hyperparameter Optimization: Human Expertise, Traditional AutoML, and LLM-Based Agents
Figure 2: Overview of our AgentHPO. The AgentHPO processes textual background information, autonomously conducts experiments with specific HPs, and iteratively optimizes them. This human-like optimization process enables AgentHPO to achieve high performance with minimal trials and provides users with an interpretable optimization solution.
Figure 3: Performance trajectory of various baselines across trials, with the X-axis indicating the trial count and the Y-axis showing the associated task metrics. To benchmark performance, we showcase the optimal outcome within 100 trials as a representation of the highest achievement attainable by human effort.
Figure 4: Link Prediction performance trajectory comparison between OPRO and AgentHPO
Figure 5: Comparison of optimization trajectories between GPT-3.5 and GPT-4
...and 2 more figures
