StructSum Generation for Faster Text Comprehension
Parag Jain, Andreea Marzoca, Francesco Piccinno
TL;DR
StructSum speeds up text comprehension by converting passages into structured representations: tables, generated with a divide-and-generate strategy, and mind maps, generated with iterative prompting. It introduces three output critics (Factuality, Local Structure, and Global Structure) and an Auto-QA framework that quantifies semantic coverage. Auto-QA is validated against SQuAD, and the usefulness of the structured outputs is measured in a text comprehension user study. Built on PaLM-2 Unicorn, the approach improves output accuracy by +37pp for mind maps and +15pp for tables, and the user study shows that readers answer questions faster with StructSums than with plain text (42.9% less time with tables, 31.9% with mind maps) without loss in accuracy. Together, these contributions establish a scalable generation and evaluation pipeline for both structured modalities and highlight practical benefits for information-seeking tasks.
Abstract
We consider the task of generating structured representations of text using large language models (LLMs). We focus on tables and mind maps as representative modalities. Tables are a more organized way of representing data, while mind maps provide a visually dynamic and flexible approach that is particularly suitable for sparse content. Despite the effectiveness of LLMs on a range of tasks, we show that current models struggle to generate structured outputs. In response, we present effective prompting strategies for both tasks. We introduce a taxonomy of problems around factuality, global structure, and local structure that is common to both modalities, and we propose a set of critiques to tackle these issues, resulting in an absolute improvement in accuracy of +37pp (79%) for mind maps and +15pp (78%) for tables. To evaluate the semantic coverage of generated structured representations, we propose Auto-QA, and we verify its adequacy using the SQuAD dataset. We further evaluate the usefulness of structured representations via a text comprehension user study. The results show a significant reduction in comprehension time compared to plain text when using tables (42.9%) and mind maps (31.9%), without loss in accuracy.
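The abstract does not spell out how Auto-QA computes semantic coverage. One plausible reading, sketched below, is that question-answer pairs are derived from the source text and coverage is the fraction of those questions that can be answered correctly from the structured representation alone. Everything in this sketch is an assumption made for illustration: the names `normalize`, `coverage`, and `toy_answerer` are hypothetical, the SQuAD-style normalized exact match is a stand-in scoring rule, and the dictionary lookup replaces what would in practice be an LLM reading a table or mind map.

```python
import re
import string
from typing import Callable, Iterable, Optional, Tuple


def normalize(answer: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    answer = answer.lower()
    answer = "".join(ch for ch in answer if ch not in string.punctuation)
    answer = re.sub(r"\b(a|an|the)\b", " ", answer)
    return " ".join(answer.split())


def coverage(
    qa_pairs: Iterable[Tuple[str, str]],
    answer_from_structure: Callable[[str], Optional[str]],
) -> float:
    """Fraction of (question, gold answer) pairs answered correctly.

    `answer_from_structure` is a hypothetical reader that sees only the
    structured representation and returns None when it cannot answer.
    """
    pairs = list(qa_pairs)
    if not pairs:
        return 0.0
    hits = 0
    for question, gold in pairs:
        predicted = answer_from_structure(question)
        if predicted is not None and normalize(predicted) == normalize(gold):
            hits += 1
    return hits / len(pairs)


# Toy stand-in for a generated table: normalized question topic -> cell value.
table = {"capital of france": "Paris", "largest planet": "Jupiter"}


def toy_answerer(question: str) -> Optional[str]:
    """Hypothetical reader: strips a 'what is' prefix and looks up the table."""
    key = normalize(question)
    if key.startswith("what is "):
        key = key[len("what is "):]
    return table.get(key)


# QA pairs that would, under this assumption, be generated from the source text.
qa_pairs = [
    ("What is the capital of France?", "paris"),
    ("What is the largest planet?", "Jupiter."),
    ("What is the boiling point of water?", "100 C"),
]

score = coverage(qa_pairs, toy_answerer)
print(f"coverage = {score:.2f}")
```

In this toy run the table answers two of the three questions, so a low score would signal that the structured output dropped content present in the text. Checking such a metric against a dataset with human-written question-answer pairs, as the abstract reports doing with SQuAD, is one way to confirm that it tracks real answerability.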
