Automatic Method Illustration Generation for AI Scientific Papers via Drawing Middleware Creation, Evolution, and Orchestration

Zhuoling Li, Jiarui Zhang, Ping Hu, Jason Kuen, Jiuxiang Gu, Hossein Rahmani, Jun Liu

Abstract

Method illustrations (MIs) play a crucial role in conveying the core ideas of scientific papers, yet creating them remains labor-intensive. Taking inspiration from how human authors draw, we propose FigAgent, a novel multi-agent framework for high-quality automatic MI generation. FigAgent distills drawing experience from similar components across MIs and encapsulates it into reusable drawing middlewares that can be orchestrated for MI generation, and it evolves these middlewares to keep pace with changing drawing requirements. In addition, we introduce a novel Explore-and-Select drawing strategy that mimics human trial and error to gradually construct MIs with complex structures. Extensive experiments show the efficacy of our method.

Paper Structure

This paper contains 11 sections, 3 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Overview of our FigAgent. Our framework comprises five agents: CA, PA, DA, EA, and RA. In the Middleware Creation and Evolution stage, the CA creates a set of middlewares $\mathcal{M}$ from the automatically collected $\mathcal{D}_{\text{exp}}$, and evolves $\mathcal{M}$ to improve its efficacy and maintain alignment with the evolving research community. During Middleware Orchestration-based MI Generation, the PA parses paper text $p$ into a concept graph $\mathcal{G}$. The DA then orchestrates middlewares from $\mathcal{M}$ to gradually render concepts onto the canvas, under the Explore-and-Select strategy (we show a toy case where $(a_1,a_2,\beta)$ is simply set to $(3,3,1)$ for ease of understanding), with the EA providing evaluation feedback. Lastly, the RA refines the result to produce the MI $m$.
  • Figure 2: Qualitative results. Red boxes denote structural defects, e.g., layout inconsistencies and missing components. Orange boxes denote low-fidelity details such as blurriness. Zoom in for a better view. More zoomed-in examples are in the Supplementary.
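
As a rough illustration of the generation stage described in the Figure 1 caption, the following Python sketch shows one way an explore-then-select loop over a concept graph could look. Everything in it is an assumption made for illustration: the tuple-based canvas, the `explore_and_select` function, its `keep` parameter, the toy middlewares, and the scoring function are hypothetical and not taken from the paper. The sketch also does not attempt to reproduce the roles of $(a_1, a_2, \beta)$, which the caption does not define.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a canvas is a tuple of drawn elements, and a
# middleware is any function that draws one concept onto a canvas.
Canvas = Tuple[str, ...]
Middleware = Callable[[Canvas, str], Canvas]


def explore_and_select(
    graph: Dict[str, List[str]],
    middlewares: Dict[str, Middleware],
    evaluate: Callable[[Canvas], float],
    keep: int = 1,
) -> Canvas:
    """Render concepts one by one, keeping the best-scoring partial canvases.

    `graph` maps each concept to the concepts it depends on, so that
    dependencies are drawn first. `keep` is an assumed beam width.
    """
    order = list(TopologicalSorter(graph).static_order())
    beam: List[Canvas] = [()]  # start from a single empty canvas
    for concept in order:
        # Explore: try every middleware on every retained canvas.
        candidates = [
            draw(canvas, concept)
            for canvas in beam
            for draw in middlewares.values()
        ]
        # Select: keep only the highest-scoring candidates.
        candidates.sort(key=evaluate, reverse=True)
        beam = candidates[:keep]
    return beam[0]


# Toy middlewares (hypothetical) that append a labeled shape to the canvas.
def box(canvas: Canvas, concept: str) -> Canvas:
    return canvas + (f"box:{concept}",)


def cloud(canvas: Canvas, concept: str) -> Canvas:
    return canvas + (f"cloud:{concept}",)


# Toy evaluator (hypothetical): prefers canvases with more boxes.
def evaluate(canvas: Canvas) -> float:
    return float(sum(1 for element in canvas if element.startswith("box:")))


concept_graph = {"encoder": [], "decoder": ["encoder"], "loss": ["decoder"]}
result = explore_and_select(
    concept_graph, {"box": box, "cloud": cloud}, evaluate, keep=1
)
print(result)
```

In this toy setup the evaluator plays the part of evaluation feedback, and the dictionary of callables stands in for a middleware library; a real system would replace both with far richer components.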