Deep Tabular Research via Continual Experience-Driven Execution

Junnan Dong, Chuang Zhou, Zheng Yuan, Yifei Yu, Siyu An, Di Yin, Xing Sun, Feiyue Huang

TL;DR

This work proposes a novel agentic framework that treats tabular reasoning as a closed-loop decision-making process, and carefully designs a coupled query and table comprehension mechanism for path decision making and operational execution.

Abstract

Large language models often struggle with complex long-horizon analytical tasks over unstructured tables, which typically feature hierarchical and bidirectional headers and non-canonical layouts. We formalize this challenge as Deep Tabular Research (DTR), which requires multi-step reasoning over interdependent table regions. To address DTR, we propose a novel agentic framework that treats tabular reasoning as a closed-loop decision-making process. We carefully design a coupled query and table comprehension mechanism for path decision making and operational execution. Specifically, (i) DTR first constructs a hierarchical meta graph to capture bidirectional semantics, mapping natural language queries into an operation-level search space; (ii) to navigate this space, we introduce an expectation-aware selection policy that prioritizes high-utility execution paths; (iii) crucially, historical execution outcomes are synthesized into a siamese structured memory, i.e., parameterized updates and abstracted texts, enabling continual refinement. Extensive experiments on challenging unstructured tabular benchmarks verify the framework's effectiveness and highlight the necessity of separating strategic planning from low-level execution for long-horizon tabular reasoning.
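The closed loop described in the abstract (select an execution path, run it, fold the outcome back into memory) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the selection formula, so a standard UCB1-style score stands in for the expectation-aware policy, and every name here (`PathStats`, `Memory`, `select_path`, `run_loop`, the operation names) is hypothetical. The two fields of `Memory` are only a loose analogue of the "parameterized updates and abstracted texts" mentioned above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PathStats:
    """Running statistics for one candidate execution path (hypothetical)."""
    pulls: int = 0
    total_reward: float = 0.0

    @property
    def mean(self) -> float:
        return self.total_reward / self.pulls if self.pulls else 0.0


@dataclass
class Memory:
    """Two-sided memory: numeric statistics plus short textual notes."""
    stats: dict = field(default_factory=dict)
    notes: list = field(default_factory=list)

    def update(self, path, reward, note):
        s = self.stats.setdefault(path, PathStats())
        s.pulls += 1
        s.total_reward += reward
        self.notes.append(note)


def select_path(paths, memory, c=1.0):
    """UCB1-style choice (an assumption, not the paper's policy):
    try each path once, then pick the highest mean + exploration bonus."""
    total = sum(memory.stats[p].pulls for p in paths if p in memory.stats)
    best, best_score = None, -math.inf
    for p in paths:
        s = memory.stats.get(p)
        if s is None or s.pulls == 0:
            return p  # untried paths are explored first
        score = s.mean + c * math.sqrt(math.log(total) / s.pulls)
        if score > best_score:
            best, best_score = p, score
    return best


def run_loop(paths, execute, rounds=50, c=1.0):
    """Closed loop: select a path, execute it, record the feedback."""
    memory = Memory()
    for _ in range(rounds):
        path = select_path(paths, memory, c)
        reward = execute(path)  # caller-supplied executor returning a score
        memory.update(path, reward, f"{' -> '.join(path)}: reward={reward:.2f}")
    return memory


if __name__ == "__main__":
    # Paths are tuples of made-up operation names; rewards are toy values.
    paths = [("filter", "aggregate"),
             ("flatten_headers", "filter", "aggregate"),
             ("aggregate",)]
    toy_reward = {paths[0]: 0.4, paths[1]: 0.9, paths[2]: 0.1}
    mem = run_loop(paths, lambda p: toy_reward[p], rounds=60)
    for p in paths:
        print(p, mem.stats[p].pulls)
```

In this toy setting the planner (`select_path`) never touches the table itself and the executor never decides what to try next, which mirrors the separation of strategic planning from low-level execution that the abstract argues for.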

Paper Structure

This paper contains 66 sections, 6 equations, 7 figures, 6 tables, 1 algorithm.

Figures (7)

  • Figure 1: Existing Table QA pipelines (a) are limited to well-structured tables and shallow queries, and fail to handle unstructured tabular properties (b) and long-horizon analytical tasks (c), motivating our Deep Tabular Research.
  • Figure 2: Overview of the proposed Deep Tabular Research framework for complex unstructured tabular reasoning. DTR decomposes analytical intent into meta operations, plans executable macro paths via expectation-guided search, and executes them with an experience-aware memory that records structured execution feedback and updates planning policies across iterations.
  • Figure 3: LLM call budget analysis. The blue curve shows performance (left y-axis) and the purple curve shows marginal gain (right y-axis). DTR's 4.78-call configuration (red star) achieves optimal efficiency by avoiding the plateau region.
  • Figure 4: Path selection evolution across 10 batches. Colors ranging from light blue to deep purple indicate exploration and exploitation, respectively.
  • Figure 5: Employment Distribution in 1984: Agriculture (4.5%) vs Non-Agriculture (95.5%)
  • ...and 2 more figures