L2T-Tune: LLM-Guided Hybrid Database Tuning with LHS and TD3

Xinyue Yang; Chen Zheng; Yaoyang Hou; Renhao Zhang; Yinyan Zhang; Yanjun Wu; Heng Zhang

TL;DR

L2T-Tune is a three-stage, LLM-guided hybrid database tuner that targets three problems: high-dimensional knob spaces, weak warm starts, and limited transferability. It combines a uniform Latin Hypercube Sampling (LHS) warm start (Stage 1), LLM-guided recommendations from GPTuner or DB-BERT (Stage 2), and TD3 fine-tuning on an RF+PCA-reduced space (Stage 3). Empirically, it improves on strong baselines by up to 73% (37.1% on average) across MySQL workloads and PostgreSQL, converges rapidly on a single server, and supports semi-transfer online tuning within about 30 steps. In practice, the framework offers fast convergence, strong generalization to hardware changes, and modest computational requirements, making it a viable option for production database tuning with limited parallel resources.
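To make the Stage 1 warm start concrete, the following is a minimal sketch of Latin Hypercube Sampling over a knob space, assuming continuous knobs with simple lower and upper bounds. The knob names and ranges in `KNOB_BOUNDS` and the helper `latin_hypercube` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples configurations with Latin Hypercube Sampling.

    Each knob's range is split into n_samples equal strata, and every
    stratum is used exactly once per knob.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired randomly across knobs.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    # Transpose per-knob columns into per-configuration rows.
    return [list(row) for row in zip(*columns)]


# Hypothetical knob ranges, for illustration only.
KNOB_BOUNDS = {
    "buffer_pool_gb": (1.0, 16.0),
    "io_capacity": (100.0, 2000.0),
    "max_connections": (50.0, 1000.0),
}

samples = latin_hypercube(8, list(KNOB_BOUNDS.values()), seed=0)
for config in samples:
    print(dict(zip(KNOB_BOUNDS, (round(v, 2) for v in config))))
```

Compared with independent uniform draws, this stratification guarantees that each knob's range is covered evenly even with few samples, which is the property a shared warm-start pool would rely on.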

Abstract

Configuration tuning is critical for database performance. Although recent advances in database tuning have shown promising improvements in throughput and latency, challenges remain. First, the vast knob space makes direct optimization unstable and slow to converge. Second, reinforcement learning pipelines often lack effective warm-start guidance and require long offline training. Third, transferability is limited: when hardware or workloads change, existing models typically require substantial retraining to recover performance. To address these limitations, we propose L2T-Tune, a new LLM-guided hybrid database tuning framework built around a three-stage pipeline. Stage one performs a warm start that generates uniform samples across the knob space and simultaneously logs them into a shared pool. Stage two leverages a large language model to mine and prioritize tuning hints from manuals and community documents for rapid convergence. Stage three uses the warm-start sample pool to reduce the dimensionality of knobs and state features, then fine-tunes the configuration with the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. We evaluate L2T-Tune against state-of-the-art models. Compared with the best-performing alternative, our approach improves performance by an average of 37.1% across all workloads, and by up to 73% on TPC-C. Compared with models trained with reinforcement learning, it converges rapidly in the offline tuning stage on a single server. Moreover, during the online tuning stage, it takes only 30 steps to reach its best results.
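For readers unfamiliar with TD3, the sketch below shows the critic target of the standard algorithm: target policy smoothing followed by the minimum over two target critics. It is a generic illustration, not the authors' code. The function `td3_target`, the toy actor and critics, and the default hyperparameters (common values from the TD3 literature) are assumptions, and the delayed actor update that completes TD3 is omitted.

```python
import random


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0):
    """Compute the TD3 critic target for a single transition."""
    smoothed = []
    for a in target_actor(next_state):
        # Target policy smoothing: add clipped Gaussian noise to the action.
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        smoothed.append(max(action_low, min(action_high, a + noise)))
    # Clipped double-Q: use the smaller estimate to curb overestimation.
    q_next = min(target_q1(next_state, smoothed),
                 target_q2(next_state, smoothed))
    # No bootstrapping past a terminal step (done == 1.0).
    return reward + gamma * (1.0 - done) * q_next


# Toy stand-ins for the target networks (hypothetical, illustration only).
def toy_actor(state):
    return [0.5 * x for x in state]


def toy_q1(state, action):
    return sum(action)


def toy_q2(state, action):
    return sum(action) + 1.0


rng = random.Random(0)
y = td3_target(1.0, [0.4, -0.2], 0.0, toy_actor, toy_q1, toy_q2, rng,
               noise_std=0.0)
print(y)
```

In a tuner of this kind, the state would be the reduced set of database metrics and the action the reduced knob vector, both scaled to the action bounds before being passed in.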
