
Fine-Grained Table Retrieval Through the Lens of Complex Queries

Wojciech Kosiuk, Xingyu Ji, Yeounoh Chung, Fatma Özcan, Madelon Hulsebos

TL;DR

This work presents and studies DCTR, a table retrieval mechanism that combines fine-grained typed query decomposition with global connectivity-awareness, to handle the challenges of open-domain question answering over relational databases in complex usage contexts.

Abstract

Enabling question answering over tables and databases in natural language has become a key capability in the democratization of insights from tabular data sources. These systems must first retrieve the data relevant to a given natural language query, for which several methods have been introduced. In this work we present and study DCTR, a table retrieval mechanism that combines fine-grained typed query decomposition with global connectivity-awareness, to handle the challenges of open-domain question answering over relational databases in complex usage contexts. We evaluate the effectiveness of the two mechanisms through the lens of retrieval complexity, which we measure along two axes: query complexity and data complexity. Our analyses over industry-aligned benchmarks illustrate the robustness of DCTR for highly composite queries and densely connected databases.

Paper Structure (37 sections, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Overview of DCTR with component-wise retrieval, group formation, and FK-group expansion.
  • Figure 2: Query length vs. Recall@25, aggregated across datasets. A comparison across three embedding models shows that DCTR outperforms the baseline, especially for longer queries ($\geq$40 tokens).
  • Figure 3: Number of components vs. Recall@25 across datasets. A comparison across three embedding models shows that DCTR is robust for compositional queries, while a single-vector baseline does not resolve compositional queries regardless of model capacity.
  • Figure 4: Number of tables connected to gold tables vs. Recall@25, combined across datasets, for different embedding models. DCTR improves more on queries whose gold tables are densely connected.
  • Figure 5: Query length vs. Recall@k, combined across datasets. A comparison of the proposed method and the baseline across three embedding models. For all values of $k$, the proposed method consistently outperforms the baseline, especially for longer queries ($\geq$40 tokens).
  • ...and 4 more figures
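
The figure captions above report Recall@25 and Recall@k. As a reading aid, here is a minimal sketch of how such a metric is commonly computed for a single query, assuming it is the fraction of gold tables that appear among the top-k retrieved tables. The function name `recall_at_k` and the table names are hypothetical and are not taken from the paper, whose exact definition and aggregation across queries may differ.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_tables: Sequence[str], gold_tables: Iterable[str], k: int) -> float:
    """Fraction of gold tables found in the top-k retrieved tables.

    Assumed definition for illustration only; the paper may aggregate
    or normalize differently.
    """
    gold = set(gold_tables)
    if not gold:
        raise ValueError("gold_tables must not be empty")
    if k < 0:
        raise ValueError("k must be non-negative")
    top_k = set(ranked_tables[:k])  # only the first k ranked results count
    return len(gold & top_k) / len(gold)


# Hypothetical ranking for one query with three gold tables.
ranking = ["orders", "customers", "products", "suppliers"]
gold = ["orders", "suppliers", "invoices"]

score_top2 = recall_at_k(ranking, gold, k=2)  # only "orders" is in the top 2
score_top4 = recall_at_k(ranking, gold, k=4)  # "orders" and "suppliers" are found
```

Under this definition, a query that needs several gold tables scores full recall only if every one of them is retrieved within the cutoff, which is why multi-table queries are harder for a single ranked list to satisfy.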