Automated Program Repair of Uncompilable Student Code
Griffin Pitts, Aum Pandya, Darsh Rank, Tirth Bhatt, Muntasir Hoq, Bita Akram
TL;DR
Uncompilable student code poses a barrier to cognitive modeling of how students learn to program. The paper evaluates syntax-only automated program repair (APR) with large language models (LLMs) to recover compilable submissions while preserving students' structural intent. It samples 100 uncompilable Java submissions from CodeWorkout and tests three LLMs under low- and high-context prompts, measuring compilability, edit distance, and expert judgments of Structural and Logical Preservation. Findings show near-universal compilability across models, with GPT-5 delivering the strongest preservation of control flow and logic, illustrating both the potential and the limitations of LLM-based APR for enriching student learning analytics.
Abstract
A significant portion of student programming submissions in CS1 learning environments are uncompilable, limiting their use in student modeling and downstream knowledge tracing. Traditional modeling pipelines often exclude these cases, discarding observations of student learning. This study investigates automated program repair as a strategy to recover uncompilable code while preserving students' structural intent for use in student modeling. Within this framework, we assess large language models (LLMs) as repair agents under high- and low-context prompting conditions. Repairs were evaluated for compilability, edit distance, and preservation of students' original structure and logic. While all models produced compilable repairs, they differed in how well they preserved students' control flow and code structure, which affects their pedagogical utility. By recovering uncompilable submissions, this work enables richer and more comprehensive analyses of learners' coding processes and development over time.
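To make the edit-distance criterion concrete, the following is a minimal sketch of how the gap between a student's submission and its repair could be quantified. It assumes a character-level Levenshtein distance; the abstract does not say which edit-distance variant or granularity (characters, tokens, lines) the study used. The function name `edit_distance` and the two Java snippets are hypothetical illustrations, not data from the study.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute cost 1).

    Assumption: the abstract does not specify the exact metric; this is one
    common choice, shown only for illustration.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


# Hypothetical submission missing a semicolon, and a syntax-only repair.
original = "int sum = 0\nfor (int i = 0; i < n; i++) { sum += i; }"
repaired = "int sum = 0;\nfor (int i = 0; i < n; i++) { sum += i; }"

print(edit_distance(original, repaired))  # prints 1
```

A small distance, as in this example, suggests the repair changed only what was needed to compile. A large distance may indicate that the repair rewrote the student's code rather than fixing its syntax, which is why edit distance complements the expert judgments of structural and logical preservation.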
