Flexible Control Flow Graph Alignment for Delivering Data-Driven Feedback to Novice Programming Learners
Md Towhidul Absar Chowdhury, Maheen Riaz Contractor, Carlos R. Rivero
TL;DR
This work tackles scalable feedback delivery for novice programmers by sidestepping rigid program comparisons in automated repair. It extends the CLARA framework with a flexible control-flow-graph (CFG) alignment that annotates CFG nodes semantically and considers topology, enabling repairs across more diverse incorrect programs. Empirical results on Codeforces problems show that flexible alignment achieves about 46% full repairs, markedly higher than the original CLARA baseline (~5%), and remains effective across problem difficulties. The approach also advances parser/interpreter capabilities to handle real-world Python constructs and demonstrates practical potential for data-driven feedback at scale, with future directions including integration with variable-trace repairs and large language models.
Abstract
Supporting learners in introductory programming assignments at scale is a necessity. This support includes automated feedback on what learners did incorrectly. Existing approaches cast the problem as automatically repairing a learner's incorrect program by extrapolating from an existing correct program written by another learner. However, such approaches are limited because they only compare programs with similar control flow and statement order. A potentially valuable set of repair feedback from flexible comparisons is thus missing. In this paper, we present several modifications to CLARA, an open-source, data-driven automated repair approach, to deal with real-world introductory programs. We extend CLARA's abstract syntax tree processor to handle common introductory programming constructs. Additionally, we propose a flexible alignment algorithm over control flow graphs in which we enrich nodes with semantic annotations extracted from the operations and calls in each program. Using this alignment, we modify an incorrect program's control flow graph to match that of the correct program so that CLARA's original repair process can be applied. We evaluate our approach against a baseline on the twenty most popular programming problems in Codeforces. Our results indicate that flexible alignment achieves a significantly higher percentage of successful repairs: 46%, compared to 5% for baseline CLARA. Our implementation is available at https://github.com/towhidabsar/clara.
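
To make the idea of aligning annotated control flow graphs concrete, the following is a minimal sketch and not the algorithm implemented in CLARA. It assumes each graph is a plain dictionary with hypothetical `labels` (node name to a set of annotation strings) and `edges` (list of source/target pairs) entries, scores node pairs by an equal-weight mix of annotation overlap and degree similarity, and searches all injective mappings by brute force, which is only feasible for very small graphs. The scoring formula, the weights, and all function names are assumptions made for illustration.

```python
from itertools import permutations


def jaccard(a, b):
    """Overlap between two annotation sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def degrees(graph, node):
    """Return (in-degree, out-degree) of a node."""
    in_deg = sum(1 for _, dst in graph["edges"] if dst == node)
    out_deg = sum(1 for src, _ in graph["edges"] if src == node)
    return in_deg, out_deg


def node_score(g1, n1, g2, n2):
    """Assumed similarity: half annotation overlap, half degree similarity."""
    label_sim = jaccard(g1["labels"][n1], g2["labels"][n2])
    in1, out1 = degrees(g1, n1)
    in2, out2 = degrees(g2, n2)
    topo_sim = 1.0 / (1 + abs(in1 - in2) + abs(out1 - out2))
    return 0.5 * label_sim + 0.5 * topo_sim


def align(incorrect, correct):
    """Map every node of `incorrect` to a distinct node of `correct`.

    Brute force over injective mappings; illustrative only.
    """
    src = sorted(incorrect["labels"])
    dst = sorted(correct["labels"])
    if len(src) > len(dst):
        raise ValueError("sketch assumes the incorrect graph is not larger")
    best_score, best_map = -1.0, {}
    for perm in permutations(dst, len(src)):
        score = sum(node_score(incorrect, s, correct, d)
                    for s, d in zip(src, perm))
        if score > best_score:
            best_score, best_map = score, dict(zip(src, perm))
    return best_map


# Hypothetical toy graphs: the correct program has an extra conditional node.
INCORRECT = {
    "labels": {"entry": {"input"}, "loop": {"loop", "add"}, "exit": {"print"}},
    "edges": [("entry", "loop"), ("loop", "loop"), ("loop", "exit")],
}
CORRECT = {
    "labels": {"a": {"input"}, "b": {"loop", "add", "compare"},
               "c": {"cond"}, "d": {"print"}},
    "edges": [("a", "b"), ("b", "b"), ("b", "c"), ("c", "d")],
}

mapping = align(INCORRECT, CORRECT)
unmatched = set(CORRECT["labels"]) - set(mapping.values())
print(mapping, unmatched)
```

In this toy case the two graphs differ in shape, yet the annotations still pair `entry` with `a`, `loop` with `b`, and `exit` with `d`. The correct graph's node `c` is left unmatched, which in a pipeline of this kind would mark where the incorrect graph needs structural changes before a repair step can run.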
