Simulating Students with Large Language Models: A Review of Architecture, Mechanisms, and Role Modelling in Education with Generative AI

Luis Marquez-Carpintero, Alberto Lopez-Sellers, Miguel Cazorla

TL;DR

Simulated Students built on large language models (LLMs) address the need to explore diverse learner profiles and pedagogical interventions safely and at scale. The paper synthesises architectures, cognitive modelling mechanisms, personality modelling, and evaluation practices, highlighting direct prompt-based simulation, knowledge tracing, and knowledge graphs as central approaches. It finds that LLM-based agents can approximate basic learner trajectories and interactions but face challenges in affective fidelity, long-term memory, and bias, which call for rigorous validation and ethical safeguards. Overall, the work provides a taxonomy and roadmap for integrating generative AI into adaptive learning systems and teacher training, with implications for curriculum design and instructional evaluation.

Abstract

Simulated Students offer a valuable methodological framework for evaluating pedagogical approaches and modelling diverse learner profiles, tasks that are otherwise challenging to undertake systematically in real-world settings. Recent research has increasingly focused on developing such simulated agents to capture a range of learning styles, cognitive development pathways, and social behaviours. Among contemporary simulation techniques, the integration of large language models (LLMs) into educational research has emerged as a particularly versatile and scalable paradigm. LLMs afford a high degree of linguistic realism and behavioural adaptability, enabling agents to approximate cognitive processes and engage in contextually appropriate pedagogical dialogues. This paper presents a thematic review of empirical and methodological studies utilising LLMs to simulate student behaviour across educational environments. We synthesise current evidence on the capacity of LLM-based agents to emulate learner archetypes, respond to instructional inputs, and interact within multi-agent classroom scenarios. Furthermore, we examine the implications of such systems for curriculum development, instructional evaluation, and teacher training. While LLMs surpass rule-based systems in natural language generation and situational flexibility, concerns persist regarding algorithmic bias, evaluation reliability, and alignment with educational objectives. The review identifies existing technological and methodological gaps and proposes future research directions for integrating generative AI into adaptive learning systems and instructional design.

Paper Structure

This paper contains 29 sections, 7 figures, 8 tables.

Figures (7)

  • Figure 1: PRISMA flow diagram illustrating the process of study identification, screening, eligibility, and inclusion for this review.
  • Figure 2: Evolution of the number of publications on Simulated Students.
  • Figure 3: Reflective mechanism of the TIR module.
  • Figure 4: Cyclic communication model in the AICademic multi-agent system.
  • Figure 5: EduAgent framework pipeline.
  • ...and 2 more figures