Paper2SysArch: Structure-Constrained System Architecture Generation from Scientific Papers

Ziyi Guo, Zhou Liu, Wentao Zhang

TL;DR

This work tackles the lack of standardized evaluation for generating system-architecture diagrams from scientific papers by proposing the Paper2SysArch Benchmark, a large-scale dataset of 3,000 paper–diagram pairs with a three-tier semantic, layout, and visual evaluation framework. It also introduces Paper2SysArch, an end-to-end multi-agent system that converts papers into editable, structured diagrams using a hierarchical three-layer graph representation and a distributed generation pipeline. The benchmark emphasizes structure-centric semantics via a machine-readable GraphJSON ground truth, enabling reproducible and fair comparisons across methods. The agent-based approach performs well on visual and layout quality, while semantic fidelity remains the main challenge. The results point to controllable, automated scientific visualization as a promising direction, with semantic reconstruction and layout flexibility as the main areas for future improvement.

Abstract

The manual creation of system architecture diagrams for scientific papers is a time-consuming and subjective process, while existing generative models lack the necessary structural control and semantic understanding for this task. A primary obstacle hindering research and development in this domain has been the profound lack of a standardized benchmark to quantitatively evaluate the automated generation of diagrams from text. To address this critical gap, we introduce a novel and comprehensive benchmark, the first of its kind, designed to catalyze progress in automated scientific visualization. It consists of 3,000 research papers paired with their corresponding high-quality ground-truth diagrams and is accompanied by a three-tiered evaluation metric assessing semantic accuracy, layout coherence, and visual quality. Furthermore, to establish a strong baseline on this new benchmark, we propose Paper2SysArch, an end-to-end system that leverages multi-agent collaboration to convert papers into structured, editable diagrams. To validate its performance on complex cases, the system was evaluated on a manually curated and more challenging subset of these papers, where it achieves a composite score of 69.0. This work's principal contribution is the establishment of a large-scale, foundational benchmark to enable reproducible research and fair comparison. Meanwhile, our proposed system serves as a viable proof-of-concept, demonstrating a promising path forward for this complex task.

Paper Structure

This paper contains 61 sections, 4 equations, 15 figures, 2 tables.

Figures (15)

  • Figure 1: Typical failure cases of general-purpose image generation models for scientific diagrams (evaluated using the Nanobanana Gemini model).
  • Figure 2: Domain distribution of system architecture diagrams in the 108 papers.
  • Figure 3: Structural diversity and complexity metrics across domains.
  • Figure 4: Overall evaluation pipeline of the Paper2SysArch benchmark.
  • Figure 5: Overview of the Paper2SysArch Agent.
  • ...and 10 more figures