StorageXTuner: An LLM Agent-Driven Automatic Tuning Framework for Heterogeneous Storage Systems

Qi Lin, Zhenyu Zhang, Viraj Thakkar, Zhenjie Sun, Mai Zheng, Zhichao Cao

TL;DR

StorageXTuner automates tuning for heterogeneous storage systems with a collaborative four-agent LLM framework (Executor, Extractor, Searcher, Reflector) that splits benchmarking, data extraction, configuration search, and insight management into modular tasks. It combines an insight-driven tree search with a layered memory system to reuse validated tuning knowledge while guarding against unsafe actions, which helps it generalize across systems and versions. The authors implement a Python prototype and report performance gains across RocksDB, LevelDB, CacheLib, and InnoDB, including up to 575% higher throughput and up to 88% lower p99 latency relative to out-of-the-box settings, along with ablation and sensitivity analyses that highlight the value of context, insights, and closed-loop validation. By introducing new evaluation metrics and a reusable multi-agent architecture, StorageXTuner offers a practical approach to LLM-driven storage tuning that applies beyond a single system or workload.

Abstract

Automatically configuring storage systems is hard: parameter spaces are large and conditions vary across workloads, deployments, and versions. Heuristic and ML tuners are often system-specific, require manual glue, and degrade under changes. Recent LLM-based approaches help but usually treat tuning as a single-shot, system-specific task, which limits cross-system reuse, constrains exploration, and weakens validation. We present StorageXTuner, an LLM agent-driven auto-tuning framework for heterogeneous storage engines. StorageXTuner separates concerns across four agents: Executor (sandboxed benchmarking), Extractor (performance digest), Searcher (insight-guided configuration exploration), and Reflector (insight generation and management). The design couples an insight-driven tree search with layered memory that promotes empirically validated insights and employs lightweight checkers to guard against unsafe actions. We implement a prototype and evaluate it on RocksDB, LevelDB, CacheLib, and MySQL InnoDB with YCSB, MixGraph, and TPC-H/C. Relative to out-of-the-box settings and to ELMo-Tune, respectively, StorageXTuner reaches up to 575% and 111% higher throughput, reduces p99 latency by as much as 88% and 56%, and converges with fewer trials.
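
To make the closed loop in the abstract concrete, here is a minimal sketch of how the four roles might hand work to one another. It is an illustration under stated assumptions, not the authors' implementation: the class `TuningLoop`, the function `toy_benchmark`, the parameter names `cache_mb` and `threads`, and the limits are all hypothetical. The Searcher here is a plain greedy "double one parameter" heuristic standing in for the paper's LLM-guided tree search, and the benchmark is a toy formula rather than a real storage engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]


def toy_benchmark(config: Config) -> Dict[str, int]:
    """Stand-in for the Executor's sandboxed run (hypothetical formula)."""
    throughput = min(config["cache_mb"], 256) + 10 * min(config["threads"], 8)
    return {"throughput": throughput}


@dataclass
class TuningLoop:
    benchmark: Callable[[Config], Dict[str, int]]  # Executor role
    limits: Dict[str, Tuple[int, int]]             # bounds used by the checker
    insights: List[str] = field(default_factory=list)

    def is_safe(self, config: Config) -> bool:
        """Lightweight checker: reject configurations outside allowed bounds."""
        return all(lo <= config[k] <= hi for k, (lo, hi) in self.limits.items())

    def extract(self, raw: Dict[str, int]) -> int:
        """Extractor role: reduce raw benchmark output to one score."""
        return raw["throughput"]

    def propose(self, best: Config) -> List[Config]:
        """Searcher role (assumed heuristic): double one parameter at a time."""
        return [{**best, k: v * 2} for k, v in best.items()]

    def run(self, start: Config, rounds: int) -> Tuple[Config, int]:
        best, best_score = start, self.extract(self.benchmark(start))
        for _ in range(rounds):
            trials = [
                (self.extract(self.benchmark(c)), c)
                for c in self.propose(best)
                if self.is_safe(c)
            ]
            if not trials:
                break
            score, cand = max(trials, key=lambda t: t[0])
            if score <= best_score:
                break  # no candidate improved on the current best
            # Reflector role: record what the validated trial taught us.
            changed = next(k for k in cand if cand[k] != best[k])
            self.insights.append(
                f"raising {changed} from {best[changed]} to {cand[changed]} helped"
            )
            best, best_score = cand, score
        return best, best_score


loop = TuningLoop(
    benchmark=toy_benchmark,
    limits={"cache_mb": (16, 256), "threads": (1, 16)},
)
best_config, best_score = loop.run({"cache_mb": 64, "threads": 2}, rounds=6)
```

The point of the sketch is the separation of concerns: every candidate passes the checker before it is benchmarked, and an insight is recorded only after a trial has actually improved the score.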

Paper Structure

This paper contains 20 sections, 11 figures, 6 tables.

Figures (11)

  • Figure 1: StorageXTuner framework
  • Figure 2: Automated Benchmarking and Analysis
  • Figure 3: Insight-Driven Configuration Exploration
  • Figure 4: Example Prompt for Configuration Proposal
  • Figure 5: Tuning Insight Generation and Management. The LLM updates an insight’s confidence through Upvote or Downvote based on observed results (e.g., Node 2 shows performance consistent with Insight N’s prediction, while Insight 3 contradicts it). Insights with high confidence are promoted from STM to LTM as validated, reusable knowledge, while low-confidence insights are discarded or demoted for further evaluation.
  • ...and 6 more figures
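
The Figure 5 caption describes insights gaining or losing confidence through upvotes and downvotes, with promotion from short-term memory (STM) to long-term memory (LTM). The following is a rough sketch of that bookkeeping only. The class names, the integer confidence score, and the thresholds `promote_at` and `discard_at` are assumptions made for illustration; the source does not state how confidence is represented or what the cutoffs are.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Insight:
    text: str
    confidence: int = 0  # assumed representation: net upvotes minus downvotes


@dataclass
class InsightMemory:
    promote_at: int = 2    # hypothetical threshold for STM -> LTM
    discard_at: int = -2   # hypothetical threshold for dropping an insight
    stm: Dict[str, Insight] = field(default_factory=dict)
    ltm: Dict[str, Insight] = field(default_factory=dict)

    def add(self, key: str, text: str) -> None:
        """New insights start in short-term memory with neutral confidence."""
        self.stm[key] = Insight(text)

    def vote(self, key: str, consistent: bool) -> None:
        """Upvote if the observed result matched the insight, else downvote."""
        tier = self.stm if key in self.stm else self.ltm
        insight = tier[key]  # raises KeyError for an unknown insight
        insight.confidence += 1 if consistent else -1
        if insight.confidence <= self.discard_at:
            del tier[key]                      # discarded
        elif insight.confidence >= self.promote_at:
            self.ltm[key] = tier.pop(key)      # promoted, or kept in LTM
        else:
            self.stm[key] = tier.pop(key)      # kept in, or demoted to, STM


memory = InsightMemory()
memory.add("cache", "a larger block cache helps read-heavy workloads")
memory.add("sync", "disabling sync always helps")
memory.vote("cache", consistent=True)
memory.vote("cache", consistent=True)   # reaches promote_at, moves to LTM
memory.vote("sync", consistent=False)
memory.vote("sync", consistent=False)   # reaches discard_at, removed
```

A later contradicting result for a promoted insight lowers its confidence below the promotion threshold, so in this sketch it moves back to STM for further evaluation rather than being dropped outright.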