From Bugs to Breakthroughs: Novice Errors in CS2
Nadja Just, Janet Siegmund, Belinda Schantong
TL;DR
The paper investigates the errors novices make in CS2 programming and how those errors evolve over a semester, extending the McCall & Kölling error categorization framework to include logical errors. Using unit-test-driven detection across 67 submissions from 13 students (710 errors in total, as reported in the abstract), it finds that semantic errors dominate early while syntax errors are rare throughout, and that logical errors peak mid-course and then decline as students practice new concepts, particularly data structures. Key contributions include a replication of a prior CS2 error study, the addition of logical-error categories, a longitudinal perspective on student development, and a publicly available replication package for broader reuse. The findings have practical implications for CS2 pedagogy, suggesting a focus on threshold concepts in data structures and strategy-driven use of IDEs and tests to shift students toward concept-focused programming.
Abstract
Background: Programming is a fundamental skill in computer science and software engineering specifically. Mastering it is a challenge for novices, which is evidenced by numerous errors that students make during programming assignments. Objective: In our study, we want to identify common programming errors in CS2 courses and understand how students evolve over time. Method: To this end, we conducted a longitudinal study of errors that students of a CS2 course made in subsequent programming assignments. Specifically, we manually categorized 710 errors based on a modified version of an established error framework. Result: We could observe a learning curve of students, such that they start out with only a few syntactical errors, but with a high number of semantic errors. During the course, the syntax and semantic errors almost completely vanish, but logical errors remain consistently present. Conclusion: Thus, students have little trouble with learning the programming language, but need more time to understand and express concepts in a programming language.
