Natural Language Interfaces for Databases: What Do Users Think?

Panos Ipeirotis, Haotian Zheng

TL;DR

This work addresses the usability gap in NLIDBs by conducting a mixed-method user study that pits a state-of-the-art NL2SQL system (SQL-LLM) against a traditional SQL platform (Snowflake) on realistic BI tasks. The study combines quantitative measures (completion time, accuracy, reformulations) with qualitative insights (think-aloud, frustration, strategies) to reveal not only performance advantages but also how users think and adapt when using NLIDBs. Results show SQL-LLM achieving 10–30% faster task completion and 75% accuracy versus Snowflake's 50%, along with reduced reformulations and lower frustration, especially on hard queries, and a shift toward schema-first, systematic querying. The findings underscore that usability, interactive explanations, and error handling are as crucial as raw translation quality for real-world deployment of NLIDBs in business analytics. These insights guide the design of future NL2SQL systems toward transparent, adaptive, and user-centric interfaces that enable broader and more reliable data access.

Abstract

Natural Language Interfaces for Databases (NLIDBs) aim to make database querying accessible by allowing users to ask questions in everyday language rather than using formal SQL queries. Despite significant advancements in translation accuracy, critical usability challenges, such as user frustration, query refinement strategies, and error recovery, remain underexplored. To investigate these usability dimensions, we conducted a mixed-method user study comparing SQL-LLM, a state-of-the-art NL2SQL system, with Snowflake, a traditional SQL analytics platform. Our controlled evaluation involved 20 participants completing realistic database querying tasks across 12 queries each. Results show that SQL-LLM significantly reduced query completion times by 10 to 30 percent (mean: 418 s vs. 629 s, p = 0.036) and improved overall accuracy from 50 to 75 percent (p = 0.002). Additionally, participants using SQL-LLM exhibited fewer query reformulations, recovered from errors 30 to 40 seconds faster, and reported lower frustration levels compared to Snowflake users. Behavioral analysis revealed that SQL-LLM encouraged structured, schema-first querying strategies, enhancing user confidence and efficiency, particularly for complex queries. These findings underscore the practical significance of well-designed, user-friendly NLIDBs in business analytics settings, emphasizing the critical role of usability alongside technical accuracy in real-world deployments.

Paper Structure

This paper contains 34 sections, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Completion Time by Difficulty Level: Mean query completion time (s) by difficulty level (Easy, Medium, Hard).
  • Figure 2: Success Rate by Difficulty Level: Success rates (%) by query difficulty level (Easy, Medium, Hard) for SQL-LLM and Snowflake.
  • Figure 3: Learning Curve: Completion Time Over Tasks: Learning curve showing mean completion time (s) across the sequence of 12 tasks.
  • Figure 4: Query Reformulations by Difficulty Level: Average number of query reformulations per query by difficulty level.
  • Figure 5: Frustration and Recovery by Difficulty Level: Mean frustration levels (self-reported on a 5-point Likert scale) and recovery times (seconds) following errors, by query difficulty.
  • ...and 1 more figure