Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases

Xiang Zhang, Khatoon Khedri, Reza Rawassizadeh

TL;DR

The paper investigates whether large language models can substitute for traditional SQL when querying tabular data by benchmarking nine open-source LLMs (7B–34B parameters) against SQLite on a small stock-transaction dataset. It measures execution time, memory, and energy, along with accuracy for both NL-to-SQL generation and direct NL answers, revealing substantial energy overhead and modest accuracy gains for LLMs relative to native SQL. Larger models improve accuracy but consume significantly more energy, and quantized variants offer some gains in efficiency at the cost of scalability and accuracy. The authors advise against replacing relational databases with LLMs in current practice and suggest hybrid approaches that integrate LLM capabilities with conventional SQL parsing to balance accessibility and efficiency.

Abstract

Large Language Models (LLMs) can automate or substitute different types of tasks in the software engineering process. This study evaluates the resource utilization and accuracy of LLMs in interpreting and executing natural language queries, compared with traditional SQL within relational database management systems. We empirically examine the resource utilization and accuracy of nine LLMs ranging from 7 to 34 billion parameters, including Llama2 7B, Llama2 13B, Mistral, Mixtral, Optimus-7B, SUS-chat-34B, platypus-yi-34b, NeuralHermes-2.5-Mistral-7B and Starling-LM-7B-alpha, using a small transaction dataset. Our findings indicate that using LLMs for database queries incurs significant energy overhead (even for small and quantized models), making it an environmentally unfriendly approach. Therefore, we advise against replacing relational databases with LLMs due to their substantial resource utilization.
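
To make the SQL side of such a comparison concrete, the following is a minimal sketch of how execution time and Python-level memory could be recorded for a SQLite query. It is not the authors' harness: the `transactions` schema, the sample rows, and the `measure_query` helper are hypothetical and chosen only for illustration. Energy is not measured here, since that requires platform-specific tooling, and `tracemalloc` only sees allocations made by the Python interpreter, not those made inside the SQLite C library.

```python
import sqlite3
import time
import tracemalloc

# Hypothetical rows; the paper's actual stock-transaction dataset is not reproduced here.
ROWS = [
    ("2024-01-02", "AAPL", "BUY", 10, 185.50),
    ("2024-01-02", "MSFT", "BUY", 5, 370.00),
    ("2024-01-03", "AAPL", "SELL", 4, 187.25),
    ("2024-01-04", "MSFT", "SELL", 5, 372.10),
]


def build_db():
    """Create an in-memory SQLite database with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "trade_date TEXT, ticker TEXT, side TEXT, quantity INTEGER, price REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", ROWS)
    conn.commit()
    return conn


def measure_query(conn, sql, repeats=100):
    """Return (rows, average seconds per run, peak Python-level bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(repeats):
        rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    # Peak covers Python objects only, not SQLite's internal C allocations.
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return rows, elapsed / repeats, peak


conn = build_db()
rows, avg_seconds, peak_bytes = measure_query(
    conn,
    "SELECT ticker, SUM(quantity) FROM transactions "
    "WHERE side = 'BUY' GROUP BY ticker ORDER BY ticker",
)
print(rows, f"{avg_seconds:.6f} s per query", f"{peak_bytes} bytes peak")
```

The same `measure_query` wrapper could in principle be placed around a call to an LLM inference function to obtain comparable timing figures, though a fair comparison would also need process-level memory and energy readings that this sketch does not attempt.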

Paper Structure (15 sections, 2 figures, 5 tables)

Figures (2)

  • Figure 1: The average energy consumption (J) for direct query results of LLM models
  • Figure 2: The average energy consumption (J) for SQL query generation of LLM models