MultiTEND: A Multilingual Benchmark for Natural Language to NoSQL Query Translation

Zhiqian Qin, Yuanfeng Song, Jinwei Lu, Yuanwei Song, Shuaimin Li, Chen Jason Zhang

TL;DR

This work tackles multilingual natural language to NoSQL query generation by introducing MultiTEND, the first large-scale multilingual benchmark spanning six languages, and by proposing MultiLink, a three-component pipeline that addresses lexical and structural challenges via Intention-aware Multilingual Data Augmentation, Parallel Multilingual Sketch-Schema Prediction, and Retrieval-Augmented Chain-of-Thought Query Generation. The authors demonstrate that existing baselines underperform in multilingual settings and that MultiLink yields substantial improvements across all languages and metrics, notably in Execution Accuracy ($EX$). They validate the approach through extensive experiments, ablation studies, and parameter analyses, which reveal the importance of schema linking, data augmentation, and retrieval context for robust multilingual NoSQL generation. The work advances practical multilingual NL-to-NoSQL systems and lays the groundwork for extending to additional languages and cost-efficient architectures in future research.

Abstract

Natural language interfaces for NoSQL databases are increasingly vital in the big data era, enabling users to interact with complex, unstructured data without deep technical expertise. However, most recent advancements focus on English, leaving a gap for multilingual support. This paper introduces MultiTEND, the first and largest multilingual benchmark for natural language to NoSQL query generation, covering six languages: English, German, French, Russian, Japanese, and Mandarin Chinese. Using MultiTEND, we analyze the challenges of translating natural language to NoSQL queries across diverse linguistic structures, including lexical and syntactic differences. Experiments show that accuracy in both English and non-English settings remains relatively low, with a 4%-6% gap across scenarios such as fine-tuned SLM, zero-shot LLM, and RAG for LLM. To address these challenges, we introduce MultiLink, a novel framework that bridges the gap between multilingual input and NoSQL query generation through a Parallel Linking Process. It breaks the task down into multiple steps, integrating parallel multilingual processing, Chain-of-Thought (CoT) reasoning, and Retrieval-Augmented Generation (RAG) to tackle the lexical and structural challenges inherent in multilingual NoSQL generation. MultiLink improves on the top baseline in all metrics for every language, boosting execution accuracy by about 15% for English and by 10% on average for non-English languages.
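The abstract reports gains in execution accuracy without defining the metric here. The following is a minimal sketch of one common reading of it: the fraction of examples whose predicted query returns the same result set as the gold query. The function names, the order-insensitive comparison, and the convention that a failed query is represented by `None` are all assumptions made for illustration, not the paper's evaluation code.

```python
import json
from collections import Counter


def canonical(rows):
    """Order-insensitive multiset of result rows (assumption: row order is ignored)."""
    # Serialize each row with sorted keys so that key order does not matter.
    return Counter(json.dumps(row, sort_keys=True, ensure_ascii=False) for row in rows)


def execution_accuracy(pairs):
    """Fraction of (predicted_rows, gold_rows) pairs whose results match.

    predicted_rows is None when the predicted query failed to execute
    (a hypothetical convention for this sketch).
    """
    if not pairs:
        return 0.0
    hits = sum(
        1
        for predicted, gold in pairs
        if predicted is not None and canonical(predicted) == canonical(gold)
    )
    return hits / len(pairs)


example_pairs = [
    ([{"a": 1}, {"a": 2}], [{"a": 2}, {"a": 1}]),  # same rows, different order
    (None, [{"a": 1}]),                            # predicted query failed
    ([{"a": 1}], [{"a": 3}]),                      # different result
]
score = execution_accuracy(example_pairs)
```

Comparing result multisets rather than query strings means two differently written queries count as equivalent when they return the same data. A stricter variant would compare ordered lists for queries that contain a sort stage.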

Paper Structure

This paper contains 55 sections, 11 figures, 10 tables, 1 algorithm.

Figures (11)

  • Figure 1: We developed a semi-automated pipeline to extend the monolingual dataset into a multilingual version through three steps: (1) Translation of Database Fields, where English-exclusive fields were translated using LLM-powered tools and manually verified; (2) Translation of NLQs, where NLQs were translated with few-shot prompting for semantic consistency and manually corrected; and (3) Translation of NoSQL Queries, where queries were programmatically parsed, updated with multilingual representations, and verified based on execution results. Each step combined machine-generated methods with rigorous manual verification.
  • Figure 2: The pipeline of our proposed MultiLink method. MultiLink consists of three main processes: (i) Intention-aware Multilingual Data Augmentation (MIND), which enriches training data by generating diverse query pairs through multilingual synthesis and comprehensive database schema analysis; (ii) Parallel Multilingual Sketch-Schema Prediction, which maps, in parallel, the intentions of multilingual NLQs to operators and their entity mentions to schema elements, including (a) Multilingual NoSQL Sketch Generation, which generates intermediate NoSQL sketches reflecting operator mappings, and (b) Monolingual Schema Linking Generation, which performs precise schema linking for each language; and (iii) Retrieval-Augmented Chain-of-Thought Query Prediction, which synthesizes the final NoSQL query by integrating operator and schema mappings with multilingual context.
  • Figure 3: Parameter study
  • Figure 4: NoSQL Query Statistics in MultiTEND
  • Figure 5: (NLQ, Query) Statistics
  • ...and 6 more figures
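
Step (3) in the Figure 1 caption, where NoSQL queries are parsed and updated with multilingual representations, can be illustrated with a small sketch. It assumes a MongoDB-style aggregation pipeline held as nested Python lists and dicts, plus an English-to-target field-name mapping. The helper names, the sample pipeline, and the German field names are hypothetical; the authors' actual tooling is not described in this excerpt.

```python
def translate_path(path, field_map):
    """Translate each segment of a dotted field path, leaving unknown segments as-is."""
    return ".".join(field_map.get(segment, segment) for segment in path.split("."))


def translate_query(node, field_map):
    """Recursively rename field references in a MongoDB-style query structure.

    Assumptions of this sketch: keys starting with "$" are operators and are kept,
    other keys are field paths, and string values starting with a single "$" are
    field references. Literal data values are left untouched.
    """
    if isinstance(node, dict):
        return {
            (key if key.startswith("$") else translate_path(key, field_map)):
                translate_query(value, field_map)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [translate_query(item, field_map) for item in node]
    if isinstance(node, str) and node.startswith("$") and not node.startswith("$$"):
        return "$" + translate_path(node[1:], field_map)
    return node


# Hypothetical English query and English-to-German field mapping.
english_pipeline = [
    {"$match": {"address.city": "Berlin"}},
    {"$group": {"_id": "$name", "total": {"$sum": "$price"}}},
]
field_map = {"address": "Adresse", "city": "Stadt", "name": "Name", "price": "Preis"}

german_pipeline = translate_query(english_pipeline, field_map)
```

Working on the parsed structure rather than on the raw query text avoids accidentally rewriting operators or literal values that happen to contain a field name. As the caption notes, the rewritten query would still be checked by executing it against the translated database.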