H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables

Nikhil Abhyankar; Vivek Gupta; Dan Roth; Chandan K. Reddy

TL;DR

H-STAR addresses the core challenge of tabular reasoning by integrating semantic language understanding with symbolic computation in a fixed two-stage pipeline. It first performs multi-view table extraction to produce a query-specific, compact table, then employs adaptive reasoning that uses semantic methods for direct lookups and lexical questions while invoking SQL-based reasoning for quantitative tasks, with SQL-derived evidence feeding the final textual reasoning. Across TabFact, WikiTQ, and FeTaQA, H-STAR consistently outperforms state-of-the-art baselines and demonstrates robustness across multiple LLMs, while ablation studies confirm the essential contribution of both the extraction and adaptive reasoning stages. The approach also delivers practical efficiency gains through targeted extraction, better handling of longer tables, and a manageable generation budget, suggesting strong potential for scalable, accurate tabular QA in real-world settings.

Abstract

Tabular reasoning involves interpreting natural language queries about tabular data, which presents a unique challenge of combining language understanding with structured data analysis. Existing methods employ either textual reasoning, which excels in semantic interpretation but struggles with mathematical operations, or symbolic reasoning, which handles computations well but lacks semantic understanding. This paper introduces a novel algorithm, H-STAR, that integrates both symbolic and semantic (textual) approaches in a two-stage process to address these limitations. H-STAR employs: (1) step-wise table extraction using 'multi-view' column retrieval followed by row extraction, and (2) adaptive reasoning that selects the reasoning strategy based on the question type, using semantic reasoning for direct lookup and complex lexical queries while augmenting textual reasoning with symbolic reasoning support for quantitative and logical tasks. Our extensive experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three tabular question-answering (QA) and fact-verification datasets, underscoring its effectiveness and efficiency.
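The two-stage flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: H-STAR drives each step with LLM prompts, whereas the sketch below substitutes simple keyword matching so that it runs on its own. The toy table, the function names, the keyword list, and the hand-written SQL query are all hypothetical and serve only to show how a query-specific table is extracted first (columns, then rows) and how SQL-derived evidence can be attached to the input of the textual reasoning step.

```python
import re
import sqlite3

# Hypothetical toy table; names and values are illustrative only.
COLUMNS = ["player", "team", "goals", "year"]
ROWS = [
    ("Ana", "Red", 12, 2019),
    ("Ben", "Blue", 7, 2019),
    ("Cy", "Red", 9, 2020),
    ("Dee", "Blue", 15, 2020),
]

# Assumed cue words for quantitative questions (stand-in for an LLM decision).
QUANT_WORDS = {"total", "sum", "average", "count", "many", "most", "least"}


def tokens(text):
    """Lowercased word tokens of a question."""
    return set(re.findall(r"\w+", text.lower()))


def extract_columns(columns, question):
    """Stage 1a: keep columns named in the question (keyword stand-in)."""
    q = tokens(question)
    kept = [c for c in columns if c.lower() in q]
    return kept or list(columns)  # fall back to all columns


def extract_rows(columns, rows, kept_columns, question):
    """Stage 1b: keep rows with a cell mentioned in the question,
    projected onto the columns kept in stage 1a."""
    q = tokens(question)
    idx = [columns.index(c) for c in kept_columns]
    matched = [r for r in rows if any(str(v).lower() in q for v in r)]
    matched = matched or list(rows)  # fall back to all rows
    return [tuple(r[i] for i in idx) for r in matched]


def is_quantitative(question):
    """Stage 2 routing: does the question look like it needs computation?"""
    return bool(tokens(question) & QUANT_WORDS)


def sql_evidence(columns, rows, sql):
    """Run a SQL query over the extracted table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f"CREATE TABLE t ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


def build_reasoning_input(question, sql=None):
    """Assemble the text that would be handed to the textual reasoning step."""
    cols = extract_columns(COLUMNS, question)
    sub = extract_rows(COLUMNS, ROWS, cols, question)
    lines = [" | ".join(cols)] + [" | ".join(str(v) for v in r) for r in sub]
    if sql is not None and is_quantitative(question):
        # SQL result is added as extra evidence, not returned as the answer.
        lines.append(f"SQL evidence: {sql_evidence(cols, sub, sql)}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

In this sketch the SQL result is appended as evidence rather than returned directly, mirroring the abstract's description of textual reasoning being augmented with symbolic support. A lookup-style question without quantitative cue words skips the SQL step entirely.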

Paper Structure (45 sections, 21 figures, 16 tables, 1 algorithm)

Introduction
H-STAR Approach
Table Extraction
Column Extraction.
Row Extraction.
Adaptive Reasoning
Experiments
Benchmark Datasets.
Evaluation Metrics.
LLM Models.
Baseline Methods.
Main Results
Comparison Across Methods.
Performance Across LLMs.
Efficiency Analysis
...and 30 more sections

Figures (21)

Figure 1: An illustration of different tabular reasoning tasks (a) Fact-verification, (b) Short-form QA, and (c) Long-form QA. For each task, question Q is paired with its answer A, which varies by task. Evidence shows the relevant columns and rows needed to answer the question.
Figure 2: An illustration highlighting the complexity of table reasoning and the need for an integrated approach: (a) Original table, (b) Symbolic reasoning misinterprets the question and returns a value instead of a yes/no response, and (c) Text-based approach fails to solve a math question correctly, leading to an incorrect answer.
Figure 3: An overview of H-STAR, which combines code generation with text-based verification. Given a complex table and its question, H-STAR answers using (a) Table Extraction: extracts the question-specific table from the original by first selecting the columns and then the rows. (b) Adaptive Reasoning: when the question has any mathematical component, it uses SQL to generate an additional table, which is then used in the textual reasoning step.
Figure 4: Comparison of average table cells in the final table.
Figure 5: Error distribution on 100 error samples across datasets for H-STAR (GPT-3.5-Turbo).
...and 16 more figures
