Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis

Daniel Gomm, Cornelius Wolff, Madelon Hulsebos

TL;DR

Ambiguity in natural language queries for tabular data analysis is treated not as a defect but as a signal of user intent and of how the work of grounding a query is divided. The authors propose a cooperative query framework that distinguishes unambiguous, cooperative, and uncooperative queries, where the user and system share responsibility for grounding both the analytical procedure and the data scope. An analysis of 15 benchmarks reveals widespread data-privileged and underspecified queries, which challenges evaluation practices that score execution alone and highlights the need to evaluate interpretation capabilities separately. The paper advocates stratified evaluation, datasets annotated with grounding levels, and iterative grounding datasets, and argues for cooperative system designs that disclose grounding choices and support clarification in order to advance open-domain tabular data analysis.

Abstract

Natural language interfaces to tabular data must handle ambiguities inherent to queries. Instead of treating ambiguity as a deficiency, we reframe it as a feature of cooperative interaction where users are intentional about the degree to which they specify queries. We develop a principled framework based on a shared responsibility of query specification between user and system, distinguishing unambiguous and ambiguous cooperative queries, which systems can resolve through reasonable inference, from uncooperative queries that cannot be resolved. Applying the framework to evaluations for tabular question answering and analysis, we analyze the queries in 15 popular datasets, and observe an uncontrolled mixing of query types neither adequate for evaluating a system's execution accuracy nor for evaluating interpretation capabilities. This conceptualization around cooperation in resolving queries informs how to design and evaluate natural language interfaces for tabular data analysis, for which we distill concrete directions for future research and broader implications.
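One way to picture the concern about uncontrolled mixing of query types is to compare a single pooled accuracy with accuracy reported per query type. The following is a minimal sketch, not the authors' evaluation code: the `QueryType` enum, the `(query_type, is_correct)` record format, and the toy records are all hypothetical, and what counts as a "correct" response to an uncooperative query (for example, asking for clarification) is an assumption left to the annotator.

```python
from collections import defaultdict
from enum import Enum


class QueryType(Enum):
    # Names mirror the three-way distinction in the framework;
    # the enum itself is a hypothetical annotation scheme.
    UNAMBIGUOUS = "unambiguous"
    COOPERATIVE = "cooperative"
    UNCOOPERATIVE = "uncooperative"


def stratified_accuracy(records):
    """Return accuracy per query type.

    records: iterable of (QueryType, bool) pairs, where the bool says
    whether the system's response was judged correct (assumed format).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for query_type, is_correct in records:
        totals[query_type] += 1
        correct[query_type] += int(is_correct)
    return {qt: correct[qt] / totals[qt] for qt in totals}


# Toy, made-up records purely for illustration (not results from the paper).
records = [
    (QueryType.UNAMBIGUOUS, True),
    (QueryType.UNAMBIGUOUS, True),
    (QueryType.COOPERATIVE, True),
    (QueryType.COOPERATIVE, False),
    (QueryType.UNCOOPERATIVE, False),
]

per_type = stratified_accuracy(records)
pooled = sum(ok for _, ok in records) / len(records)

print("pooled:", pooled)
for query_type, accuracy in per_type.items():
    print(query_type.value, accuracy)
```

In this toy data the pooled score hides that the system does well on unambiguous queries and poorly elsewhere, which is the kind of distinction a per-type report makes visible.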

Paper Structure

This paper contains 9 sections, 2 figures, and 3 tables.

Figures (2)

  • Figure 1: ⓐ: Queries with different levels of specification. ⓑ & ⓒ: Relationship of queries and actionable interpretations. ⓒ shows iterative selective grounding of the cooperative query in ⓐ, progressively narrowing the set of possible interpretations until arriving at a singular interpretation.
  • Figure 2: Analysis of query characteristics across 15 tabular benchmarks.