EfficientPosterGen: Semantic-aware Efficient Poster Generation via Token Compression and Accurate Violation Detection

Wenxin Tang, Jingyu Xiao, Yanpei Gong, Fengyuan Ran, Tongchuan Xia, Junliang Liu, Man Ho Lam, Wenxuan Wang, Michael R. Lyu

TL;DR

Extensive experiments demonstrate that EfficientPosterGen achieves substantial improvements in token efficiency and layout reliability while maintaining high poster quality, offering a scalable solution for automated academic poster generation.

Abstract

Automated academic poster generation aims to distill lengthy research papers into concise, visually coherent presentations. Existing approaches based on Multimodal Large Language Models (MLLMs), however, suffer from three critical limitations: low information density in full-paper inputs, excessive token consumption, and unreliable layout verification. We present EfficientPosterGen, an end-to-end framework that addresses these challenges through semantic-aware retrieval and token-efficient multimodal generation. EfficientPosterGen introduces three core innovations: (1) Semantic-aware Key Information Retrieval (SKIR), which constructs a semantic contribution graph to model inter-segment relationships and selectively preserves important content; (2) Visual-based Context Compression (VCC), which renders selected text segments into images to shift textual information into the visual modality, significantly reducing token usage while generating poster-ready bullet points; and (3) Agentless Layout Violation Detection (ALVD), a deterministic color-gradient-based algorithm that reliably detects content overflow and spatial sparsity without auxiliary MLLMs. Extensive experiments demonstrate that EfficientPosterGen achieves substantial improvements in token efficiency and layout reliability while maintaining high poster quality, offering a scalable solution for automated academic poster generation. Our code is available at https://github.com/vinsontang1/EfficientPosterGen-Code.

Paper Structure

This paper contains 73 sections, 21 equations, 11 figures, 16 tables, 2 algorithms.

Figures (11)

  • Figure 1: An overview of the workflow of existing automated poster generation approaches, along with the major challenges they encounter in practice.
  • Figure 2: The framework of EfficientPosterGen.
  • Figure 3: An example of the layout verification process. (a) Input panel image. (b) Gradient magnitudes of vertical strips (bottom curve) and horizontal strips (left curve), with activated strips highlighted in blue (vertical) and yellow (horizontal). (c) Cartesian product of activated strips yields content regions (green) and their minimum enclosing rectangle (red). (d) Overflow detection via the red bounding box exceeding panel boundaries, and sparsity detection via the green coverage ratio.
  • Figure 4: Parameter sensitivity analysis. (a) Entropy reduction ratio under varying $\beta$ and $\gamma$. (b) Layout detection accuracy across strip numbers $N$ and activation thresholds $\tau_s$. (c)-(d) Compression ratio vs. normalized edit distance at varying DPI for GPT-5 and Qwen3-VL-8B-Instruct.
  • Figure 5: Examples of posters generated by different methods.
  • ...and 6 more figures
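
The layout check described in the Figure 3 caption (strip-wise gradient magnitudes, activated strips, their Cartesian product, a minimum enclosing rectangle, then overflow and sparsity tests) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a grayscale image stored as a nested list, uses simple forward differences in place of the paper's color gradient, and the names `check_layout`, `panel`, and `min_coverage` are hypothetical. Only the strip number N and the activation threshold tau_s correspond to parameters named in Figure 4.

```python
def gradient_magnitude(img):
    """Per-pixel |dx| + |dy| from forward differences (0 at the far edges)."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0
            grad[y][x] = abs(dx) + abs(dy)
    return grad


def active_strips(grad, n_strips, tau, vertical):
    """Indices of strips whose mean gradient magnitude exceeds tau."""
    h, w = len(grad), len(grad[0])
    length = w if vertical else h
    active = []
    for s in range(n_strips):
        lo, hi = s * length // n_strips, (s + 1) * length // n_strips
        if vertical:   # strip covers columns lo..hi-1, all rows
            vals = [grad[y][x] for y in range(h) for x in range(lo, hi)]
        else:          # strip covers rows lo..hi-1, all columns
            vals = [grad[y][x] for y in range(lo, hi) for x in range(w)]
        if vals and sum(vals) / len(vals) > tau:
            active.append(s)
    return active


def check_layout(img, panel, n_strips=8, tau=1.0, min_coverage=0.3):
    """Flag overflow and sparsity for one rendered panel.

    panel is an assumed half-open pixel box (x0, y0, x1, y1) inside img;
    min_coverage is an illustrative threshold, not a value from the paper.
    """
    h, w = len(img), len(img[0])
    grad = gradient_magnitude(img)
    cols = active_strips(grad, n_strips, tau, vertical=True)
    rows = active_strips(grad, n_strips, tau, vertical=False)
    if not cols or not rows:  # no detectable content at all
        return {"bbox": None, "overflow": False, "sparse": True, "coverage": 0.0}

    # Minimum enclosing rectangle of the active (column, row) cells.
    bbox = (cols[0] * w // n_strips, rows[0] * h // n_strips,
            (cols[-1] + 1) * w // n_strips, (rows[-1] + 1) * h // n_strips)
    x0, y0, x1, y1 = panel
    overflow = bbox[0] < x0 or bbox[1] < y0 or bbox[2] > x1 or bbox[3] > y1

    # Area of the Cartesian product of active strips, relative to the panel.
    covered_w = sum((s + 1) * w // n_strips - s * w // n_strips for s in cols)
    covered_h = sum((s + 1) * h // n_strips - s * h // n_strips for s in rows)
    coverage = min(1.0, covered_w * covered_h / ((x1 - x0) * (y1 - y0)))
    return {"bbox": bbox, "overflow": overflow,
            "sparse": coverage < min_coverage, "coverage": coverage}


def make_image(w, h, block):
    """White canvas with one dark rectangle (x0, y0, x1, y1), for testing."""
    bx0, by0, bx1, by1 = block
    return [[0 if bx0 <= x < bx1 and by0 <= y < by1 else 255
             for x in range(w)] for y in range(h)]


if __name__ == "__main__":
    panel = (0, 0, 16, 12)  # panel occupies the top 12 rows of a 16x16 canvas
    print(check_layout(make_image(16, 16, (2, 2, 14, 10)), panel))
    print(check_layout(make_image(16, 16, (2, 2, 14, 15)), panel))
```

Because the bounding box is quantized to strip boundaries, a larger strip count gives a tighter rectangle at the cost of noisier per-strip statistics. The check is deterministic and needs no model call, which is the property the abstract attributes to ALVD.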