SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement

Ishani Mondal, Zongxia Li, Yufang Hou, Anandhavelu Natarajan, Aparna Garimella, Jordan Boyd-Graber

TL;DR

SciDoc2Diagrammer is a multi-step pipeline that generates diagrams based on user intentions using intermediate code generation. SciDoc2Diagrammer-Multi-Aspect-Feedback (MAF) is a refinement strategy built on top of it that significantly enhances factual correctness and visual appeal and outperforms existing models on both automatic and human judgement.

Abstract

Automating the creation of scientific diagrams from academic papers can significantly streamline the development of tutorials, presentations, and posters, thereby saving time and accelerating the process. Current text-to-image models struggle with generating accurate and visually appealing diagrams from long-context inputs. We propose SciDoc2Diagram, a task that extracts relevant information from scientific papers and generates diagrams, along with a benchmarking dataset, SciDoc2DiagramBench. We develop a multi-step pipeline SciDoc2Diagrammer that generates diagrams based on user intentions using intermediate code generation. We observed that initial diagram drafts were often incomplete or unfaithful to the source, leading us to develop SciDoc2Diagrammer-Multi-Aspect-Feedback (MAF), a refinement strategy that significantly enhances factual correctness and visual appeal and outperforms existing models on both automatic and human judgement.

Paper Structure (49 sections, 23 figures, 26 tables)

Figures (23)

  • Figure 1: An example of Diagram Generation and Refinement using SciDoc2Diagrammer-MAF with input as a paper and user-defined intent.
  • Figure 2: Outline of the procedure for generating diagrams from academic papers based on specific user intents, followed by refinement guided by each of the critic models.
  • Figure 3: The figure (an example from SciDoc2DiagramBench) depicts SciDoc2Diagrammer-MAF, which refines diagrams based on user intent and source document. Initially, SciDoc2Diagrammer creates a diagram (Step 2), which is then refined by three critic modules that assess and provide feedback on necessary components, data accuracy, and visual design. The diagram is repeatedly refined, as shown on the left side of the figure (illustrating SumMAF), where feedback from all critics is integrated at Step 4’. The refinement continues, evaluated at Step 5’, until the maximum number of iterations is reached or the diagram meets the specified standards.
  • Figure 4: Average human rating on Completeness, Faithfulness and Layout for three subparts: SciDoc2DiagramBench-Gold (left), SciDoc2DiagramBench-Extended (middle) and SciMultiDoc2DiagramBench-Gold (right). The ratings suggest that as the complexity of diagram creation increases, our proposed Multi-Aspect Refinement strategy becomes more effective.
  • Figure 5: Refinement of a flowchart for detecting rumors from ma-etal-2017-detect, highlighting improvements in clarity and completeness after refinement but emphasizing the continued absence of a critical feedback loop.
  • ...and 18 more figures
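
The refinement loop described in the Figure 3 caption (critics give feedback, the feedback is integrated, and the loop stops when the diagram meets the standards or the iteration limit is hit) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Feedback`, `refine_with_critics`, `completeness_critic`, and `toy_refiner` are hypothetical, the critic and refiner here are simple string-based stand-ins for what would presumably be model calls, and the default of three iterations is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    aspect: str      # e.g. "completeness", "faithfulness", "layout" (assumed labels)
    satisfied: bool  # True when this critic has no further complaints
    comment: str     # free-text instruction for the refiner


# Hypothetical signatures: a critic inspects (diagram_code, document, intent);
# a refiner rewrites the diagram code given the unsatisfied feedback.
Critic = Callable[[str, str, str], Feedback]
Refiner = Callable[[str, List[Feedback]], str]


def refine_with_critics(draft: str, document: str, intent: str,
                        critics: List[Critic], refiner: Refiner,
                        max_iterations: int = 3) -> str:
    """Refine a diagram until all critics are satisfied or the limit is reached."""
    diagram = draft
    for _ in range(max_iterations):
        feedback = [critic(diagram, document, intent) for critic in critics]
        if all(f.satisfied for f in feedback):
            break  # the diagram meets every critic's standard
        # Integrate feedback from all unsatisfied critics in a single refinement step.
        diagram = refiner(diagram, [f for f in feedback if not f.satisfied])
    return diagram


def completeness_critic(diagram: str, document: str, intent: str) -> Feedback:
    """Toy critic: flags components named in the document but absent from the diagram."""
    missing = [w for w in ("encoder", "decoder") if w in document and w not in diagram]
    return Feedback("completeness", not missing, "add: " + ", ".join(missing))


def toy_refiner(diagram: str, feedback: List[Feedback]) -> str:
    """Toy refiner: appends one node line per component requested by a critic."""
    for f in feedback:
        if f.comment.startswith("add: "):
            for node in f.comment[len("add: "):].split(", "):
                diagram += f"\nnode {node}"
    return diagram


result = refine_with_critics(
    draft="node encoder",
    document="encoder feeds decoder",
    intent="show architecture",
    critics=[completeness_critic],
    refiner=toy_refiner,
)
print(result)
```

In this toy run the first pass finds the missing "decoder" node, the refiner adds it, and the second pass satisfies the critic, so the loop exits before the iteration limit. Critics for data accuracy and visual design would plug into the same `critics` list.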