Natural Language Interfaces for Databases: What Do Users Think?
Panos Ipeirotis, Haotian Zheng
TL;DR
This work addresses the usability gap in NLIDBs through a mixed-method user study that compares a state-of-the-art NL2SQL system (SQL-LLM) with a traditional SQL platform (Snowflake) on realistic BI tasks. The study combines quantitative measures (completion time, accuracy, reformulations) with qualitative insights (think-aloud comments, frustration, strategies) to reveal not only performance differences but also how users think and adapt when using NLIDBs. SQL-LLM achieved 10–30% faster task completion and 75% accuracy versus Snowflake's 50%, with fewer reformulations and lower frustration, especially on hard queries. Participants using SQL-LLM also shifted toward schema-first, systematic querying. The findings underscore that usability, interactive explanations, and error handling are as crucial as raw translation quality for real-world deployment of NLIDBs in business analytics. These insights guide the design of future NL2SQL systems toward transparent, adaptive, and user-centric interfaces that enable broader and more reliable data access.
Abstract
Natural Language Interfaces for Databases (NLIDBs) aim to make database querying accessible by allowing users to ask questions in everyday language rather than writing formal SQL queries. Despite significant advances in translation accuracy, critical usability challenges, such as user frustration, query refinement strategies, and error recovery, remain underexplored. To investigate these usability dimensions, we conducted a mixed-method user study comparing SQL-LLM, a state-of-the-art NL2SQL system, with Snowflake, a traditional SQL analytics platform. Our controlled evaluation involved 20 participants, each completing 12 realistic database queries. Results show that SQL-LLM significantly reduced query completion times by 10 to 30 percent (mean: 418 s vs. 629 s, p = 0.036) and improved overall accuracy from 50 to 75 percent (p = 0.002). Additionally, participants using SQL-LLM reformulated queries less often, recovered from errors 30 to 40 seconds faster, and reported lower frustration than Snowflake users. Behavioral analysis revealed that SQL-LLM encouraged structured, schema-first querying strategies, enhancing user confidence and efficiency, particularly for complex queries. These findings underscore the practical significance of well-designed, user-friendly NLIDBs in business analytics settings and the critical role of usability alongside technical accuracy in real-world deployments.
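
As a reading aid, the headline figures can be related by simple arithmetic. The sketch below is not the authors' analysis; it only recomputes relative change from the aggregates quoted in the abstract, and the function names are hypothetical. The two reported means taken alone imply a reduction of roughly 33.5%, so the 10 to 30 percent range presumably reflects a different breakdown (for example, per task or per difficulty level); the abstract does not say which, and that reading is an assumption.

```python
def relative_reduction(baseline: float, treatment: float) -> float:
    """Fraction by which `treatment` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - treatment) / baseline


def percentage_point_gain(baseline: float, treatment: float) -> float:
    """Absolute difference between two proportions, as a fraction."""
    return treatment - baseline


# Figures quoted in the abstract (aggregate means only).
snowflake_mean_s = 629.0   # mean completion time, Snowflake
sql_llm_mean_s = 418.0     # mean completion time, SQL-LLM
snowflake_accuracy = 0.50
sql_llm_accuracy = 0.75

time_cut = relative_reduction(snowflake_mean_s, sql_llm_mean_s)
acc_gain_pp = percentage_point_gain(snowflake_accuracy, sql_llm_accuracy)
acc_gain_rel = acc_gain_pp / snowflake_accuracy

print(f"Relative time reduction from the means: {time_cut:.1%}")
print(f"Accuracy gain: {acc_gain_pp * 100:.0f} percentage points "
      f"({acc_gain_rel:.0%} relative to the baseline)")
```

The distinction between percentage points and relative change matters when comparing such results: moving from 50 to 75 percent accuracy is a gain of 25 percentage points, which is a 50 percent relative improvement over the baseline.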
