Automatic Database Configuration Debugging using Retrieval-Augmented Language Models

Sibei Chen, Ju Fan, Bin Wu, Nan Tang, Chao Deng, Pengyi Wang, Ye Li, Jian Tan, Feifei Li, Jingren Zhou, Xiaoyong Du

TL;DR

Andromeda is a retrieval-augmented framework for automatic DBMS configuration debugging: it takes natural language (NL) questions about configuration issues and answers them with concrete knob-value recommendations. It combines offline contrastive document representation learning with online multi-source retrieval (historical questions and troubleshooting manuals) and telemetry-aware reasoning, and it uses two-phase prompting to identify the relevant knobs and their values. The approach is supported by a logic-chain data synthesis strategy and a telemetry analysis pipeline (anomalous telemetry detection and telemetry-to-text conversion), and it outperforms pre-trained language model (PLM) and non-RAG baselines on real-world datasets. The results show high accuracy in knob diagnosis, effective document retrieval, scalable telemetry processing, and practical cost efficiency, indicating strong potential for reducing DBA workload in production environments.
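The telemetry analysis pipeline mentioned above (anomalous telemetry detection followed by telemetry-to-text conversion) can be illustrated with a minimal sketch. This is not Andromeda's method: the z-score rule, the threshold of 3.0, the function names `anomalous_metrics` and `telemetry_to_text`, and the metric names and values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def anomalous_metrics(history, current, threshold=3.0):
    """Flag metrics whose current value deviates strongly from recent history.

    Uses a plain z-score test, which is an assumption for this sketch and
    not the detector described in the paper.
    """
    flagged = []
    for name, values in history.items():
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            continue  # a constant history gives no scale to compare against
        z = (current[name] - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((name, current[name], round(z, 1)))
    return flagged

def telemetry_to_text(flagged):
    """Turn flagged metrics into short sentences an LLM prompt can include."""
    sentences = []
    for name, value, z in flagged:
        direction = "above" if z > 0 else "below"
        sentences.append(
            f"{name} is {value}, {abs(z)} standard deviations {direction} its recent mean."
        )
    return sentences

# Hypothetical telemetry samples (made-up values).
history = {
    "threads_running": [8, 10, 9, 11, 10, 12],
    "buffer_pool_hit_ratio": [0.99, 0.98, 0.99, 0.99, 0.98, 0.99],
}
current = {"threads_running": 64, "buffer_pool_hit_ratio": 0.99}

flagged = anomalous_metrics(history, current)
sentences = telemetry_to_text(flagged)
for sentence in sentences:
    print(sentence)
```

Only the metric that departs sharply from its history is converted to text, which keeps the prompt context short and focused on likely symptoms.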

Abstract

Database management system (DBMS) configuration debugging, e.g., diagnosing poorly configured DBMS knobs and generating troubleshooting recommendations, is crucial to optimizing DBMS performance. However, the configuration debugging process is tedious and sometimes challenging, even for seasoned database administrators (DBAs) with ample experience in DBMS configuration and a good understanding of DBMS internals (e.g., MySQL or Oracle). To address this difficulty, we propose Andromeda, a framework that utilizes large language models (LLMs) to enable automatic DBMS configuration debugging. Andromeda serves as a natural surrogate for DBAs: it answers a wide range of natural language (NL) questions on DBMS configuration issues and generates diagnostic suggestions to fix them. Nevertheless, directly prompting LLMs with these professional questions may result in overly generic and often unsatisfying answers. To this end, we propose a retrieval-augmented generation (RAG) strategy that provides matched, domain-specific context for the question from multiple sources: related historical questions, troubleshooting manuals, and DBMS telemetry. This context significantly improves the performance of configuration debugging. To support the RAG strategy, we develop a document retrieval mechanism that handles heterogeneous documents and design an effective method for telemetry analysis. Extensive experiments on real-world DBMS configuration debugging datasets show that Andromeda significantly outperforms existing solutions.
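The RAG strategy described in the abstract, which gathers context from historical questions, troubleshooting manuals, and telemetry before prompting an LLM, can be sketched as follows. This is a toy illustration under stated assumptions, not Andromeda's implementation: the bag-of-words `embed` function stands in for the learned document encoder, the prompt layout is invented, the sample documents are made up, and no LLM is actually called.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned document encoder."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question, historical, manuals, telemetry_text):
    """Assemble context from the three sources into one prompt (hypothetical layout)."""
    parts = ["Similar historical questions:"]
    parts += retrieve(question, historical)
    parts.append("Relevant manual entries:")
    parts += retrieve(question, manuals)
    parts.append("Telemetry observations:")
    parts += telemetry_text
    parts.append(f"Question: {question}")
    parts.append("Recommend the knobs to change and their values.")
    return "\n".join(parts)

# Made-up sample documents for illustration only.
historical = [
    "Error 1040 too many connections: raising max_connections resolved it.",
    "Slow reads were fixed by increasing innodb_buffer_pool_size.",
]
manuals = [
    "max_connections sets the maximum number of simultaneous client connections.",
    "innodb_buffer_pool_size sets the memory used to cache InnoDB data and indexes.",
]
telemetry_text = ["threads_connected is far above its recent mean."]

question = "MySQL reports too many connections under peak load"
prompt = build_prompt(question, historical, manuals, telemetry_text)
print(prompt)
```

Because each source is retrieved separately, the prompt always carries at least one item of each kind, rather than letting one source crowd out the others in a single ranked list.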

Paper Structure

This paper contains 21 sections, 6 equations, 13 figures, 9 tables.

Figures (13)

  • Figure 1: Overview of automatic DBMS configuration debugging, where users directly pose NL debugging questions regarding configuration issues, and a "co-pilot" diagnoses the issues and generates recommendations to fix them.
  • Figure 2: An example of our RAG strategy in Andromeda. (a) A straightforward strategy that directly prompts an LLM with the NL question results in overly generic and unhelpful answers. (b) Our RAG strategy provides domain-specific context for an NL debugging question from multiple sources, which improves the inference capabilities of the LLM on configuration debugging.
  • Figure 3: An overview of Andromeda. (a) Offline: Andromeda learns representation for heterogeneous documents and stores the document embeddings in a vector database. (b) Online: Andromeda utilizes an RAG-based configuration debugging strategy to recommend configurations for an NL question $q$ and a database $D$.
  • Figure 4: An overview of document representation learning.
  • Figure 5: An overview of training data augmentation.
  • ...and 8 more figures

Theorems & Definitions (5)

  • Example 1
  • Example 2
  • Example 3: Straightforward Data Synthesis
  • Example 4: Our Logic-Chain based Synthesis Strategy.
  • Example 5