BEAGLE: Behavior-Enforced Agent for Grounded Learner Emulation

Hanchen David Wang, Clayton Cohn, Zifan Xu, Siyuan Guo, Gautam Biswas, Meiyi Ma

TL;DR

BEAGLE introduces a neuro-symbolic framework to synthesize authentic student learning trajectories by integrating Self-Regulated Learning with a semi-Markov controller, Bayesian Knowledge Tracing with explicit flaw injection, and a decoupled Strategist/Executor generation pipeline. This architecture enforces novice-bound behavior and prevents silent self-correction, yielding trajectories that closely mirror real student data across behavioral, epistemic, and perceptual dimensions. Evaluations on Python programming tasks show BEAGLE reduces competency bias and achieves human-like nonlinearity, with a human Turing test indicating traces are indistinguishable from real data within a statistical equivalence bound. Ablation studies confirm the critical role of the semi-Markov dynamics and epistemic constraints in sustaining realistic learning progressions, while the framework enables scalable stress-testing of tutoring interventions.

Abstract

Simulating student learning behaviors in open-ended problem-solving environments holds potential for education research, from training adaptive tutoring systems to stress-testing pedagogical interventions. However, collecting authentic data is challenging due to privacy concerns and the high cost of longitudinal studies. While Large Language Models (LLMs) offer a promising path to student simulation, they suffer from competency bias: they optimize for efficient correctness rather than the erratic, iterative struggle characteristic of novice learners. We present BEAGLE, a neuro-symbolic framework that addresses this bias by incorporating Self-Regulated Learning (SRL) theory into a novel architecture. BEAGLE integrates three key technical innovations: (1) a semi-Markov model that governs the timing and transitions of cognitive and metacognitive behaviors; (2) Bayesian Knowledge Tracing with explicit flaw injection to enforce realistic knowledge gaps and "unknown unknowns"; and (3) a decoupled agent design that separates high-level strategy use from code generation actions to prevent the model from silently correcting its own intentional errors. In evaluations on Python programming tasks, BEAGLE significantly outperforms state-of-the-art baselines in reproducing authentic trajectories. In a human Turing test, users were unable to distinguish synthetic traces from real student data; their accuracy (52.8%) was indistinguishable from random guessing.
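To make the first two components concrete, the following is a minimal sketch, not the authors' implementation. It pairs a semi-Markov step (draw a dwell time for the current cognitive state, then draw the next state) with the textbook Bayesian Knowledge Tracing update. The state names follow the figure captions. Everything else is an assumption made for illustration: the transition probabilities, the mean dwell times, the exponential dwell distribution, the slip/guess/learn parameters, and the rule that a flaw is injected with probability `1 - p_know`.

```python
import random

# Cognitive states named in the figure captions.
STATES = ("Constructing", "Debugging", "Assessing")

# Placeholder transition probabilities (illustrative, NOT values from the paper).
TRANSITIONS = {
    "Constructing": {"Constructing": 0.3, "Debugging": 0.3, "Assessing": 0.4},
    "Debugging": {"Constructing": 0.3, "Debugging": 0.3, "Assessing": 0.4},
    "Assessing": {"Constructing": 0.5, "Debugging": 0.4, "Assessing": 0.1},
}

# Placeholder mean dwell times in seconds (illustrative, NOT from the paper).
MEAN_DWELL = {"Constructing": 60.0, "Debugging": 45.0, "Assessing": 15.0}


def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """Standard BKT: Bayesian posterior on mastery, then the learning transition."""
    if correct:
        num = p_know * (1 - p_slip)
        den = num + (1 - p_know) * p_guess
    else:
        num = p_know * p_slip
        den = num + (1 - p_know) * (1 - p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * p_learn


def sample_segment(state, rng):
    """One semi-Markov step: sample how long to stay, then where to go next."""
    # Assumption: exponential dwell times; a semi-Markov model allows any
    # duration distribution here.
    dwell = rng.expovariate(1.0 / MEAN_DWELL[state])
    candidates = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][s] for s in candidates]
    next_state = rng.choices(candidates, weights=weights, k=1)[0]
    return dwell, next_state


def simulate(n_segments, seed=0):
    """Return a list of (state, dwell_seconds, flaw_injected, p_know) tuples."""
    rng = random.Random(seed)
    state, p_know, trace = "Constructing", 0.2, []
    for _ in range(n_segments):
        dwell, next_state = sample_segment(state, rng)
        # Hypothetical flaw-injection rule: lower mastery means more flaws.
        flaw = rng.random() > p_know
        if state == "Assessing":
            # Treat an assessment as an observation: a flawed attempt is incorrect.
            p_know = bkt_update(p_know, correct=not flaw)
        trace.append((state, round(dwell, 1), flaw, round(p_know, 3)))
        state = next_state
    return trace
```

Calling `simulate(10)` returns a ten-segment trace. Under these assumptions a symbolic controller, not the LLM, decides when the simulated learner switches behavior and whether an attempt should contain a flaw.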

Paper Structure (122 sections, 28 equations, 18 figures, 15 tables)

Figures (18)

  • Figure 1: Cognitive behavior transitions reveal competency bias: real students exhibit debugging persistence (D$\rightarrow$D: 28%) while LLMs prefer constructing linearly. C (Constructing), D (Debugging), A (Assessing).
  • Figure 2: Neuro-symbolic framework interacting with the environment, with tutor intervention.
  • Figure 3: Overview of the BEAGLE agent architecture. The Symbolic Control (left) governs high-level behavior through semi-Markov models for metacognitive behaviors (Planning, Reflecting, Monitoring, Enacting) and cognitive states (Constructing, Assessing, Debugging), alongside BKT for modeling knowledge acquisition. The Neural Action (right) implements a two-stage LLM pipeline: a Strategist that determines goals, mindset, and directives based on the current state, and an Executor that generates naturalistic code and monologue. Buffer/Thought (avoiding repetition) and Memories (learning from short-term errors) mechanisms ensure coherent, non-repetitive behavior for both LLM agents (more details of the prompts are given in the appendix). The Environment (bottom) executes code via an IDE Oracle and provides feedback signals that update both the symbolic state machines and the neural action's context.
  • Figure 4: Cognitive behavior transitions. BEAGLE (blue) closely matches real data.
  • Figure 5: Metacognitive behavior transitions (changes only). BEAGLE closely matches Real. Baselines show extreme P$\rightarrow$R bias (axis break at 50%).
  • ...and 13 more figures

Theorems & Definitions (1)

  • Definition 1