Structsum Generation for Faster Text Comprehension

Parag Jain, Andreea Marzoca, Francesco Piccinno

TL;DR

StructSum speeds up text comprehension by transforming passages into structured representations: tables, generated with a divide-and-generate prompt, and mind maps, generated with an iterative prompting strategy. It introduces three output critics (Factuality, Local Structure, and Global Structure) and an Auto-QA framework that quantifies semantic coverage and is validated against SQuAD. Built on PaLM-2 Unicorn, the approach improves output accuracy by +37pp for mind maps and +15pp for tables. In a text comprehension user study, StructSums reduce the time needed to answer questions by 42.9% for tables and 31.9% for mind maps compared to text, without loss in accuracy. Together, these contributions establish a scalable generation and evaluation pipeline for structured outputs in both modalities and highlight practical benefits for information-seeking tasks.

Abstract

We consider the task of generating structured representations of text using large language models (LLMs). We focus on tables and mind maps as representative modalities. Tables are a more organized way of representing data, while mind maps provide a visually dynamic and flexible approach, particularly suitable for sparse content. Despite the effectiveness of LLMs on different tasks, we show that current models struggle with generating structured outputs. In response, we present effective prompting strategies for both of these tasks. We introduce a taxonomy of problems around factuality, global structure, and local structure, common to both modalities, and propose a set of critiques to tackle these issues, resulting in an absolute improvement in accuracy of +37pp (79%) for mind maps and +15pp (78%) for tables. To evaluate the semantic coverage of generated structured representations we propose Auto-QA, and we verify the adequacy of Auto-QA using the SQuAD dataset. We further evaluate the usefulness of structured representations via a text comprehension user study. The results show a significant reduction in comprehension time compared to text when using tables (42.9%) and mind maps (31.9%), without loss in accuracy.

Paper Structure

This paper contains 37 sections, 1 equation, 18 figures, 7 tables, 1 algorithm.

Figures (18)

  • Figure 1: Overview of the (a) table and (b) mind map generation prompts. The prompting steps are colored. Panel (a) illustrates the divide-and-generate prompt: the input passage is first segmented into sub-passages, and multiple tables are then generated. Panel (b) shows the generation process for mind maps: after the main concept has been generated, an iterative expansion phase follows, during which the mind map is expanded until termination.
  • Figure 2: Example table generation for the text at top, comparing single table (left) vs multiple table generation (right). Some parts in the table and text were truncated (...) for readability. The full example is reported in Figure \ref{fig:full_table_example}.
  • Figure 3: Example mind map output. The full example along with the input text is reported in Figure \ref{fig:full_mindmap_example}.
  • Figure 4: Auto-QA based coverage. A point $\langle X,Y \rangle$ on each line shows that $X$% of the data has at least $Y$% coverage, as measured using Auto-QA.
  • Figure 5: Results of the timed text comprehension user study. Plots show the 95% confidence interval over the time taken, in seconds, to answer a question with different structure combinations as context. For both tables (left) and mind maps (right), compared to text only, we observe a significant reduction ($42.9\%$ and $31.9\%$ resp.) in the average time taken by annotators to answer the question.
  • ...and 13 more figures
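
The reading of Figure 4, where a point $\langle X,Y \rangle$ means that $X$% of the data has at least $Y$% coverage, can be made concrete with a small sketch. This is an illustration only, not the authors' evaluation code. The function name `coverage_curve`, the example scores, and the idea that each passage receives a single coverage percentage are all assumptions made here for clarity.

```python
from typing import List, Sequence, Tuple


def coverage_curve(
    coverages: Sequence[float], thresholds: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (X, Y) points: X% of examples have coverage of at least Y%.

    `coverages` holds one coverage percentage per example (assumed input
    format); `thresholds` holds the Y values to evaluate.
    """
    if not coverages:
        raise ValueError("need at least one coverage score")
    n = len(coverages)
    points = []
    for y in thresholds:
        # Count the examples whose coverage reaches the threshold Y.
        reached = sum(1 for c in coverages if c >= y)
        points.append((100.0 * reached / n, y))
    return points


# Hypothetical per-passage coverage scores, in percent (not from the paper).
scores = [100.0, 80.0, 60.0, 60.0, 20.0]
curve = coverage_curve(scores, [50.0, 75.0, 100.0])
print(curve)
```

With these made-up scores, four of the five passages reach at least 50% coverage, so the first point is $\langle 80, 50 \rangle$. Raising the threshold can only keep or shrink the share of data that qualifies, which is why such curves are monotone.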