PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

Hao Zheng, Xinyan Guan, Hao Kong, Jia Zheng, Weixiang Zhou, Hongyu Lin, Yaojie Lu, Ben He, Xianpei Han, Le Sun

TL;DR

<1> The paper tackles automatic presentation generation by reframing it as an edit-based process guided by reference slides, addressing content quality, visual design, and coherence. <2> It introduces PPTAgent, which analyzes reference presentations (stage i) to extract slide types and content schemas and then generates new slides by iterative edits (stage ii) using HTML-based representations and self-correction. <3> To assess quality, PPTEval provides a holistic, reference-free evaluation across Content, Design, and Coherence, validated against human judgments. <4> Experiments on Zenodo10K across multiple domains show PPTAgent outperforming rule-based and template-based baselines, with robust generation and strong evaluation alignment. <5> The work advances automatic presentation synthesis by integrating structural understanding, editable workflows, and holistic evaluation, and the authors release code and data to foster future research.

Abstract

Automatically generating presentations from documents is a challenging task that requires accommodating content quality, visual appeal, and structural coherence. Existing methods primarily focus on improving and evaluating the content quality in isolation, overlooking visual appeal and structural coherence, which limits their practical applicability. To address these limitations, we propose PPTAgent, which comprehensively improves presentation generation through a two-stage, edit-based approach inspired by human workflows. PPTAgent first analyzes reference presentations to extract slide-level functional types and content schemas, then drafts an outline and iteratively generates editing actions based on selected reference slides to create new slides. To comprehensively evaluate the quality of generated presentations, we further introduce PPTEval, an evaluation framework that assesses presentations across three dimensions: Content, Design, and Coherence. Results demonstrate that PPTAgent significantly outperforms existing automatic presentation generation methods across all three dimensions.
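
To make the edit-based idea concrete, here is a minimal sketch of applying editing actions to a copy of a reference slide. It is an illustration under stated assumptions, not the authors' implementation: the slide is modeled as a plain dictionary of element ids to text, and the names `EditAction`, `apply_edits`, `replace_text`, and `delete` are hypothetical.

```python
from copy import deepcopy
from dataclasses import dataclass


@dataclass
class EditAction:
    # Hypothetical action record; the paper's real action format may differ.
    op: str            # "replace_text" or "delete"
    element_id: str    # id of the slide element to edit
    text: str = ""     # new text, used only by "replace_text"


def apply_edits(reference_slide, actions):
    """Return a new slide built by editing a copy of the reference slide."""
    slide = deepcopy(reference_slide)  # the reference itself stays untouched
    for action in actions:
        if action.element_id not in slide:
            raise KeyError(f"unknown element: {action.element_id}")
        if action.op == "replace_text":
            slide[action.element_id] = action.text
        elif action.op == "delete":
            del slide[action.element_id]
        else:
            raise ValueError(f"unsupported op: {action.op}")
    return slide


# Assumed toy data: a reference slide with a title and two bullets.
reference = {
    "title": "Old title",
    "bullet_1": "Old point",
    "bullet_2": "Extra point",
}
actions = [
    EditAction("replace_text", "title", "PPTAgent overview"),
    EditAction("replace_text", "bullet_1", "Two-stage, edit-based generation"),
    EditAction("delete", "bullet_2"),
]
new_slide = apply_edits(reference, actions)
```

Raising on an unknown element or operation gives a caller a clear error to feed back to the model, which is one plausible way a self-correction loop could be driven.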

Paper Structure

This paper contains 45 sections, 2 equations, 23 figures, 8 tables, and 1 algorithm.

Figures (23)

  • Figure 1: Comparison between our PPTAgent approach (left) and the conventional abstractive summarization method (right).
  • Figure 2: Overview of the PPTAgent workflow. Stage i: Presentation Analysis involves analyzing the input presentation to cluster slides into groups and extract their content schemas. Stage ii: Presentation Generation generates new presentations guided by the outline, incorporating self-correction mechanisms to ensure robustness.
  • Figure 3: PPTEval assesses presentations from three dimensions: content, design, and coherence.
  • Figure 4: Score distributions of presentations generated by PPTAgent, DocPres, and KCTV across the three evaluation dimensions: Content, Design, and Coherence, as assessed by PPTEval.
  • Figure 5: Comparative analysis of presentation generation across different methods. PPTAgent generates presentations from different reference presentations, indicated as PPTAgent(a) and PPTAgent(b).
  • ...and 18 more figures