Intent-Aware Schema Generation And Refinement For Literature Review Tables
Vishakh Padmakumar, Joseph Chee Chang, Kyle Lo, Doug Downey, Aakanksha Naik
TL;DR
This work studies how table-intent information can guide large language models to generate and refine literature-review schemas that compare research papers. It introduces synthetic table intents and an evaluation framework to reduce ambiguity in schema generation, showing that intents improve alignment with user information needs. The study benchmarks prompting workflows and fine-tuned open-weight models, finding that the fine-tuned models can be competitive with black-box LLMs and revealing trade-offs between recall and precision across prompting strategies. It also proposes three schema-refinement paradigms (unguided, heuristics-guided, and critique-guided editing) and analyzes their effectiveness, with oracle critiques delivering the strongest gains. The authors release augmented datasets, fine-tuned models, and tooling to support further exploration in intent-aware schema generation and editing.
Abstract
The increasing volume of academic literature makes it essential for researchers to organize, compare, and contrast collections of documents. Large language models (LLMs) can support this process by generating schemas defining shared aspects along which to compare papers. However, progress on schema generation has been slow due to: (i) ambiguity in reference-based evaluations, and (ii) lack of editing/refinement methods. Our work is the first to address both issues. First, we present an approach for augmenting unannotated table corpora with synthesized intents, and apply it to create a dataset for studying schema generation conditioned on a given information need, thus reducing ambiguity. With this dataset, we show how incorporating table intents significantly improves baseline performance in reconstructing reference schemas. We start by comprehensively benchmarking several single-shot schema generation methods, including prompted LLM workflows and fine-tuned models, showing that smaller, open-weight models can be fine-tuned to be competitive with state-of-the-art prompted LLMs. Next, we propose several LLM-based schema refinement techniques and show that these can further improve schemas generated by these methods.
