Student Development Agent: Risk-free Simulation for Evaluating AIED Innovations
Jianxiao Jiang, Yu Zhang
TL;DR
This work addresses the challenge of evaluating AI-enabled educational designs without risking real students by proposing a large language model (LLM)-based Student Development Agent that simulates long-term developmental trajectories under varied learning environments. The approach combines a structured input-output framework, a general education categorization, the integration of empirical findings, and iterative prompting to produce self-evolving predictions of both learning behaviors and developmental dimensions. Validation in a case study on a multi-agent learning environment (MAIC) shows predictive performance competitive with baselines, particularly for non-cognitive outcomes, and highlights the ethical and practical value of risk-free evaluation. The framework offers a scalable, transparent, and adaptable platform for informing AIED design, policy, and future methodological advances in educational research.
Abstract
In the age of AI-powered educational (AIED) innovation, evaluating the developmental consequences of novel designs before they are exposed to students has become both essential and challenging. Because such interventions may have irreversible effects, it is critical to anticipate not only potential benefits but also possible harms. This study proposes a student development agent framework based on large language models (LLMs), designed to simulate how students with diverse characteristics may develop under different educational settings without exposing real students to those settings. We validate the approach through a case study on a multi-agent learning environment (MAIC) and show that the agent's predictions align with real student outcomes in non-cognitive development. The results suggest that LLM-based simulations hold promise for evaluating AIED innovations efficiently and ethically. Future directions include enhancing profile structures, incorporating fine-tuned or small task-specific models, validating the effects of empirical findings, interpreting simulated data, and optimizing evaluation methods.
