PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
Hao Zheng, Xinyan Guan, Hao Kong, Jia Zheng, Weixiang Zhou, Hongyu Lin, Yaojie Lu, Ben He, Xianpei Han, Le Sun
TL;DR
<1> The paper tackles automatic presentation generation by reframing it as an edit-based process guided by reference slides, addressing content quality, visual design, and coherence. <2> It introduces PPTAgent, which analyzes reference presentations (stage i) to extract slide types and content schemas and then generates new slides by iterative edits (stage ii) using HTML-based representations and self-correction. <3> To assess quality, PPTEval provides a holistic, reference-free evaluation across Content, Design, and Coherence, validated against human judgments. <4> Experiments on Zenodo10K across multiple domains show PPTAgent outperforming rule-based and template-based baselines, with robust generation and strong evaluation alignment. <5> The work advances automatic presentation synthesis by integrating structural understanding, editable workflows, and holistic evaluation, and it releases code and data to foster future research.
Abstract
Automatically generating presentations from documents is a challenging task that requires accommodating content quality, visual appeal, and structural coherence. Existing methods primarily focus on improving and evaluating the content quality in isolation, overlooking visual appeal and structural coherence, which limits their practical applicability. To address these limitations, we propose PPTAgent, which comprehensively improves presentation generation through a two-stage, edit-based approach inspired by human workflows. PPTAgent first analyzes reference presentations to extract slide-level functional types and content schemas, then drafts an outline and iteratively generates editing actions based on selected reference slides to create new slides. To comprehensively evaluate the quality of generated presentations, we further introduce PPTEval, an evaluation framework that assesses presentations across three dimensions: Content, Design, and Coherence. Results demonstrate that PPTAgent significantly outperforms existing automatic presentation generation methods across all three dimensions.
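The edit-based generation step described above can be illustrated with a minimal sketch. It is not PPTAgent's implementation: the abstract does not specify the format of editing actions, so the `EditAction` class, the `apply_edits` function, the two operation names, and the dictionary used as a slide representation are all hypothetical stand-ins. The sketch only shows the general idea of editing a copy of a reference slide and collecting errors that could be fed back for a correction round.

```python
from copy import deepcopy
from dataclasses import dataclass


@dataclass
class EditAction:
    # Hypothetical action format; the paper's actual action API is not given here.
    op: str          # "replace_text" or "delete"
    element_id: str  # assumed identifier of an element in the reference slide
    text: str = ""


def apply_edits(reference_slide, actions):
    """Apply edit actions to a copy of the reference slide.

    Returns (new_slide, errors). The error list is the kind of feedback
    that could be passed back to a model for a self-correction round.
    """
    slide = deepcopy(reference_slide)  # the reference slide itself is never modified
    errors = []
    for action in actions:
        if action.element_id not in slide:
            errors.append(f"unknown element: {action.element_id}")
        elif action.op == "replace_text":
            slide[action.element_id] = action.text
        elif action.op == "delete":
            del slide[action.element_id]
        else:
            errors.append(f"unknown op: {action.op}")
    return slide, errors


# Toy reference slide: element id -> text (an assumed simplification).
reference = {"title": "Old title", "body": "Old body", "footer": "Old footer"}
actions = [
    EditAction("replace_text", "title", "PPTAgent overview"),
    EditAction("delete", "footer"),
    EditAction("replace_text", "subtitle", "No such element"),
]
new_slide, errors = apply_edits(reference, actions)
```

In this toy run the third action targets an element that does not exist, so it is reported in `errors` rather than applied. This mirrors, in a simplified way, how invalid edits might be surfaced for correction instead of silently corrupting the slide.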
