MoMQ: Mixture-of-Experts Enhances Multi-Dialect Query Generation across Relational and Non-Relational Databases

Zhisheng Lin, Yifu Liu, Zhiling Luo, Jinyang Gao, Yu Li

TL;DR

MoMQ is a novel Mixture-of-Experts-based framework for multi-dialect query generation across both relational and non-relational databases. It employs a dialect expert group for each dialect and a multi-level routing strategy to handle dialect-specific knowledge, reducing interference during query generation.

Abstract

The improvement in translating natural language to structured query language (SQL) can be attributed to advances in large language models (LLMs). Open-source LLMs, tailored for specific database dialects such as MySQL, have shown strong performance. However, cloud service providers are looking for a unified database manager service (e.g., Cosmos DB from Azure, Amazon Aurora from AWS, Lindorm from Alibaba Cloud) that can support multiple dialects. This requirement has led to the concept of multi-dialect query generation, which presents challenges to LLMs. These challenges include syntactic differences among dialects and imbalanced data distribution across multiple dialects. To tackle these challenges, we propose MoMQ, a novel Mixture-of-Experts-based multi-dialect query generation framework across both relational and non-relational databases. MoMQ employs a dialect expert group for each dialect and a multi-level routing strategy to handle dialect-specific knowledge, reducing interference during query generation. Additionally, a shared expert group is introduced to address data imbalance, facilitating the transfer of common knowledge from high-resource dialects to low-resource ones. Furthermore, we have developed a high-quality multi-dialect query generation benchmark that covers relational and non-relational databases such as MySQL, PostgreSQL, Cypher for Neo4j, and nGQL for NebulaGraph. Extensive experiments have shown that MoMQ performs effectively and robustly even in resource-imbalanced scenarios.
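
The routing idea in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`softmax`, `top_k_weights`, `route`), the example router scores, and the choice to weight each dialect group with a group-level softmax before keeping the top-k experts inside it are all hypothetical. The paper's actual routing equations are not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_weights(scores, k):
    """Keep the k largest softmax weights, zero the rest, and renormalise."""
    probs = softmax(scores)
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    kept_total = sum(probs[i] for i in keep)
    return [probs[i] / kept_total if i in keep else 0.0 for i in range(len(probs))]


def route(group_scores, expert_scores, shared_scores, k=1):
    """Hypothetical two-level routing for one token.

    group_scores:  dialect name -> score from a group-level router (assumed)
    expert_scores: dialect name -> list of scores, one per expert in that group
    shared_scores: list of scores for the shared expert group
    Returns a dict mapping (group name, expert index) -> mixing weight.
    """
    names = list(group_scores)
    group_weights = softmax([group_scores[name] for name in names])

    weights = {}
    # Level 1 picks how much each dialect group contributes;
    # level 2 picks the top-k experts inside each group.
    for name, group_weight in zip(names, group_weights):
        for idx, w in enumerate(top_k_weights(expert_scores[name], k)):
            weights[(name, idx)] = group_weight * w

    # The shared group is routed independently of the dialect groups,
    # so every dialect can draw on it.
    for idx, w in enumerate(top_k_weights(shared_scores, k)):
        weights[("shared", idx)] = w
    return weights


if __name__ == "__main__":
    example = route(
        group_scores={"MySQL": 0.0, "nGQL": 2.0},
        expert_scores={"MySQL": [0.1, 0.9], "nGQL": [1.5, 0.2]},
        shared_scores=[0.3, 0.7],
        k=1,
    )
    for key, value in sorted(example.items()):
        print(key, round(value, 3))
```

In this sketch a token whose group-level score favours nGQL sends most of its dialect weight to the nGQL experts, while the shared experts remain available regardless of dialect, which is the intuition behind transferring common knowledge from high-resource to low-resource dialects.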

Paper Structure

This paper contains 28 sections, 12 equations, 4 figures, and 7 tables.

Figures (4)

  • Figure 1: Different database management systems may use different syntax for the same user query. For example, PostgreSQL uses the "FILTER" keyword, while MySQL does not. Cypher uses "MATCH" for querying, while MySQL and PostgreSQL use "SELECT".
  • Figure 2: The overall structure of MoMQ. The original feed-forward network (FFN) is transformed into a MoE structure, which consists of a Shared Expert Group, a Dialect Expert Group, and a Multi-Level Routing Strategy. The pre-trained weights are frozen, and LoRA modules inserted into the Attention and FFN layers are fine-tuned for rapid adaptation to multi-dialect query generation. The normalization layer is left unfrozen because doing so was observed to improve results. The Dialect Router Loss and the Expert Balance Loss are added to the training objectives to adjust multi-dialect routing and to mitigate routing collapse, respectively.
  • Figure 3: Case study comparing the nGQL and MySQL queries generated by different methods in the full data setting.
  • Figure 4: Expert weight distribution when generating the nGQL query. A certain number of experts are activated in each expert group, and the nGQL expert group plays a dominant role in this generation process.
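
To make the syntactic differences described in Figure 1 concrete, the sketch below writes one question in three dialects. The question, the table and label names (`employees`, `Employee`, `salary`), and the helper functions are hypothetical and are not taken from the paper or its benchmark.

```python
# One natural-language question, three target dialects (hypothetical schema).
QUESTION = "How many employees earn more than 5000?"

QUERIES = {
    # PostgreSQL supports a FILTER clause on aggregate functions.
    "PostgreSQL": (
        "SELECT COUNT(*) FILTER (WHERE salary > 5000) AS high_earners "
        "FROM employees;"
    ),
    # MySQL has no FILTER clause, so a CASE expression is used instead.
    "MySQL": (
        "SELECT COUNT(CASE WHEN salary > 5000 THEN 1 END) AS high_earners "
        "FROM employees;"
    ),
    # Cypher matches graph patterns with MATCH rather than SELECT.
    "Cypher": (
        "MATCH (e:Employee) WHERE e.salary > 5000 "
        "RETURN count(e) AS high_earners;"
    ),
}


def leading_keyword(query):
    """Return the first keyword of a query in upper case."""
    return query.split()[0].upper()


def uses_filter_clause(query):
    """Report whether the query text contains an aggregate FILTER clause."""
    return "FILTER (" in query.upper()


if __name__ == "__main__":
    print(QUESTION)
    for dialect, query in QUERIES.items():
        print(f"{dialect}: starts with {leading_keyword(query)}, "
              f"FILTER clause: {uses_filter_clause(query)}")
```

A model trained on a mix of these dialects has to keep such conventions apart, which is the interference that the dialect expert groups are meant to reduce.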