BEAGLE: Behavior-Enforced Agent for Grounded Learner Emulation

Hanchen David Wang; Clayton Cohn; Zifan Xu; Siyuan Guo; Gautam Biswas; Meiyi Ma

TL;DR

BEAGLE introduces a neuro-symbolic framework to synthesize authentic student learning trajectories by integrating Self-Regulated Learning with a semi-Markov controller, Bayesian Knowledge Tracing with explicit flaw injection, and a decoupled Strategist/Executor generation pipeline. This architecture enforces novice-bound behavior and prevents silent self-correction, yielding trajectories that closely mirror real student data across behavioral, epistemic, and perceptual dimensions. Evaluations on Python programming tasks show BEAGLE reduces competency bias and achieves human-like nonlinearity, with a human Turing test indicating traces are indistinguishable from real data within a statistical equivalence bound. Ablation studies confirm the critical role of the semi-Markov dynamics and epistemic constraints in sustaining realistic learning progressions, while the framework enables scalable stress-testing of tutoring interventions.
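To make the semi-Markov idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes three cognitive states (C = Constructing, D = Debugging, A = Assessing), a state-dependent dwell time, and a transition table. Every number in `TRANSITIONS` and `MEAN_DWELL`, and the function name `sample_trajectory`, is a hypothetical placeholder chosen for illustration. What separates this from a plain Markov chain is that the agent stays in a state for a sampled duration before the next transition is drawn.

```python
import random

# Hypothetical transition probabilities between cognitive states.
# These values are illustrative placeholders, not figures from the paper.
TRANSITIONS = {
    "C": {"C": 0.40, "D": 0.25, "A": 0.35},
    "D": {"C": 0.35, "D": 0.30, "A": 0.35},
    "A": {"C": 0.45, "D": 0.35, "A": 0.20},
}

# Hypothetical mean dwell time (in action steps) for each state.
MEAN_DWELL = {"C": 4.0, "D": 6.0, "A": 2.0}


def sample_trajectory(n_segments, rng, start="C"):
    """Sample a list of (state, dwell_steps) segments.

    Each segment holds one state for a random number of steps (the
    semi-Markov part), then the next state is drawn from TRANSITIONS.
    """
    state = start
    segments = []
    for _ in range(n_segments):
        # Dwell time: at least one step, exponentially distributed tail.
        dwell = 1 + int(rng.expovariate(1.0 / MEAN_DWELL[state]))
        segments.append((state, dwell))
        options = list(TRANSITIONS[state])
        weights = [TRANSITIONS[state][s] for s in options]
        state = rng.choices(options, weights=weights)[0]
    return segments


if __name__ == "__main__":
    for state, dwell in sample_trajectory(8, random.Random(0)):
        print(f"{state} for {dwell} step(s)")
```

In a controller of this kind, the sampled state and remaining dwell time would be passed to the language model as a constraint, so the model cannot jump straight to a finished solution.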

Abstract

Simulating student learning behaviors in open-ended problem-solving environments holds potential for education research, from training adaptive tutoring systems to stress-testing pedagogical interventions. However, collecting authentic data is challenging due to privacy concerns and the high cost of longitudinal studies. While Large Language Models (LLMs) offer a promising path to student simulation, they suffer from competency bias, optimizing for efficient correctness rather than the erratic, iterative struggle characteristic of novice learners. We present BEAGLE, a neuro-symbolic framework that addresses this bias by incorporating Self-Regulated Learning (SRL) theory into a novel architecture. BEAGLE integrates three key technical innovations: (1) a semi-Markov model that governs the timing and transitions of cognitive and metacognitive behaviors; (2) Bayesian Knowledge Tracing with explicit flaw injection to enforce realistic knowledge gaps and "unknown unknowns"; and (3) a decoupled agent design that separates high-level strategy use from code generation actions to prevent the model from silently correcting its own intentional errors. In evaluations on Python programming tasks, BEAGLE significantly outperforms state-of-the-art baselines in reproducing authentic trajectories. In a human Turing test, users were unable to distinguish synthetic traces from real student data: their accuracy (52.8%) was indistinguishable from random guessing.
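The knowledge-tracing component can be illustrated with the standard Bayesian Knowledge Tracing update, sketched below. The four parameters (initial mastery, learn, slip, guess) and the update rule are the textbook BKT formulation. The default parameter values, the class name `BKTSkill`, and the helper `should_inject_flaw` are assumptions made for this example. In particular, deciding to inject a flaw with probability `1 - p_known` is only one plausible reading of "flaw injection"; the abstract does not specify the mechanism.

```python
import random
from dataclasses import dataclass


@dataclass
class BKTSkill:
    """Standard BKT state for one skill; default values are illustrative."""
    p_known: float = 0.2   # current estimate that the skill is mastered
    p_learn: float = 0.15  # chance of learning the skill after a step
    p_slip: float = 0.1    # chance of an error despite mastery
    p_guess: float = 0.2   # chance of success without mastery

    def update(self, correct: bool) -> float:
        """Condition on one observed outcome, then apply the learning step."""
        if correct:
            num = self.p_known * (1 - self.p_slip)
            den = num + (1 - self.p_known) * self.p_guess
        else:
            num = self.p_known * self.p_slip
            den = num + (1 - self.p_known) * (1 - self.p_guess)
        posterior = num / den
        self.p_known = posterior + (1 - posterior) * self.p_learn
        return self.p_known


def should_inject_flaw(skill: BKTSkill, rng: random.Random) -> bool:
    """Hypothetical rule: the less mastered a skill, the likelier a flaw."""
    return rng.random() > skill.p_known


if __name__ == "__main__":
    skill = BKTSkill()
    rng = random.Random(0)
    for outcome in (False, False, True, True):
        inject = should_inject_flaw(skill, rng)
        print(f"inject_flaw={inject}, p_known -> {skill.update(outcome):.3f}")
```

Keeping the mastery estimate outside the language model is what lets a symbolic layer insist on an error even when the model itself would know the correct answer.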

Paper Structure (122 sections, 28 equations, 18 figures, 15 tables)


Introduction
Problem Formulation
The Multi-Dimensional Fidelity Problem.
Method
System Overview
Symbolic Control (Behavioral Fidelity).
Neural Action (Perceptual & Epistemic Fidelity).
Environment (Physical Grounding).
Symbolic Control
Semi-Markov Dynamics.
Knowledge Modeling.
Stochastic Interrupts.
Neural Action
The Strategist.
The Executor.
...and 107 more sections

Figures (18)

Figure 1: Cognitive behavior transitions reveal competency bias: real students exhibit debugging persistence (D$\rightarrow$D: 28%) while LLMs prefer constructing linearly. C (Constructing), D (Debugging), A (Assessing).
Figure 2: Neuro-symbolic framework interacting with the environment, with tutor intervention.
Figure 3: Overview of the BEAGLE agent architecture. The Symbolic Control (left) governs high-level behavior through semi-Markov models for metacognitive behaviors (Planning, Reflecting, Monitoring, Enacting) and cognitive states (Constructing, Assessing, Debugging), alongside BKT for modeling knowledge acquisition. The Neural Action (right) implements a two-stage LLM pipeline: a Strategist that determines goals, mindset, and directives based on the current state, and an Executor that generates naturalistic code and monologue. Buffer/Thought (avoiding repetition) and Memories (learning from short-term errors) mechanisms ensure coherent, non-repetitive behavior for both LLM agents (more details on the prompts are given in the appendix). The Environment (bottom) executes code via an IDE Oracle and provides feedback signals that update both the symbolic state machines and the neural action's context.
Figure 4: Cognitive behavior transitions. BEAGLE (blue) closely matches real data.
Figure 5: Metacognitive behavior transitions (changes only). BEAGLE closely matches Real. Baselines show extreme P$\rightarrow$R bias (axis break at 50%).
...and 13 more figures

Theorems & Definitions (1)

Definition 1
