From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
Haonian Ji, Shi Qiu, Siyang Xin, Siwei Han, Zhaorun Chen, Dake Zhang, Hongyi Wang, Huaxiu Yao
TL;DR
The paper identifies a critical gap in AI-generated visual pedagogy for STEM problem solving and presents EduVisBench, a multi-domain benchmark with a five-dimension rubric for evaluating visually grounded reasoning. To close this gap, it introduces EduVisAgent, a modular multi-agent framework that orchestrates instructional planning, reasoning decomposition, metacognitive prompting, and visualization design to produce interactive, pedagogy-aligned visuals. Experimental results show EduVisAgent achieving an average score of 81.6% on EduVisBench, a 40.2% relative improvement over the best baseline, underscoring the value of coordinated, domain-aware agent collaboration for educational visualization. The work thus offers both a method for generating interactive visual explanations and a scalable platform for evaluating visually grounded pedagogy in AI systems.
Abstract
While foundation models (FMs), such as diffusion models and large vision-language models (LVLMs), have been widely applied in educational contexts, their ability to generate pedagogically effective visual explanations remains limited. Most existing approaches focus primarily on textual reasoning, overlooking the critical role of structured and interpretable visualizations in supporting conceptual understanding. To better assess the visual reasoning capabilities of FMs in educational settings, we introduce EduVisBench, a multi-domain, multi-level benchmark. EduVisBench features diverse STEM problem sets requiring visually grounded solutions, along with a fine-grained evaluation rubric informed by pedagogical theory. Our empirical analysis reveals that existing models frequently struggle to decompose complex reasoning and translate it into visual representations aligned with human cognitive processes. To address these limitations, we propose EduVisAgent, a multi-agent collaborative framework that coordinates specialized agents for instructional planning, reasoning decomposition, metacognitive prompting, and visualization design. Experimental results show that EduVisAgent substantially outperforms all baselines, achieving a 40.2% relative improvement over the best baseline and delivering more educationally aligned visualizations. EduVisBench and EduVisAgent are available at https://github.com/aiming-lab/EduVisBench and https://github.com/aiming-lab/EduVisAgent.
