Flexible Control Flow Graph Alignment for Delivering Data-Driven Feedback to Novice Programming Learners

Md Towhidul Absar Chowdhury, Maheen Riaz Contractor, Carlos R. Rivero

TL;DR

This work tackles scalable feedback delivery for novice programmers by sidestepping rigid program comparisons in automated repair. It extends the CLARA framework with a flexible control-flow-graph (CFG) alignment that annotates CFG nodes semantically and considers topology, enabling repairs across more diverse incorrect programs. Empirical results on Codeforces problems show that flexible alignment achieves about 46% full repairs, markedly higher than the original CLARA baseline (~5%), and remains effective across problem difficulties. The approach also advances parser/interpreter capabilities to handle real-world Python constructs and demonstrates practical potential for data-driven feedback at scale, with future directions including integration with variable-trace repairs and large language models.

Abstract

Supporting learners in introductory programming assignments at scale is a necessity. This support includes automated feedback on what learners did incorrectly. Existing approaches cast the problem as automatically repairing a learner's incorrect program by extrapolating from an existing correct program written by another learner. However, such approaches are limited because they only compare programs with similar control flow and statement order, so a potentially valuable set of repair feedback from more flexible comparisons is missed. In this paper, we present several modifications to CLARA, an open-source, data-driven automated repair approach, to deal with real-world introductory programs. We extend CLARA's abstract syntax tree processor to handle common introductory programming constructs. Additionally, we propose a flexible alignment algorithm over control flow graphs in which we enrich nodes with semantic annotations extracted from the operations and calls in each program. Using this alignment, we modify an incorrect program's control flow graph to match that of the correct program, so that CLARA's original repair process can be applied. We evaluate our approach against a baseline on the twenty most popular programming problems in Codeforces. Our results indicate that flexible alignment achieves a significantly higher percentage of successful repairs: 46%, compared to 5% for baseline CLARA. Our implementation is available at https://github.com/towhidabsar/clara.
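
To make the idea of annotation-based alignment concrete, the following is a minimal sketch and not CLARA's implementation. It assumes each control flow graph node can be represented as a small Python source fragment, labels each node with the operators and called function names found in it, and pairs nodes greedily by Jaccard similarity of those label sets. The function names (`annotate`, `jaccard`, `align`), the node identifiers, the Jaccard measure, and the greedy matching are all illustrative assumptions. The sketch ignores graph topology, which the paper's alignment also takes into account.

```python
import ast
from itertools import product


def annotate(source: str) -> frozenset:
    """Collect operator and call names in a code fragment (hypothetical labeling)."""
    labels = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.AugAssign)):
            labels.add(type(node.op).__name__)
        elif isinstance(node, ast.Compare):
            labels.update(type(op).__name__ for op in node.ops)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            labels.add("call:" + node.func.id)
    return frozenset(labels)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set similarity in [0, 1]; two empty label sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def align(correct: dict, incorrect: dict) -> dict:
    """Greedy one-to-one mapping from incorrect nodes to correct nodes."""
    c_labels = {name: annotate(src) for name, src in correct.items()}
    i_labels = {name: annotate(src) for name, src in incorrect.items()}
    # Score every candidate pair, then take the best-scoring pairs first.
    scored = sorted(
        ((jaccard(c_labels[c], i_labels[i]), c, i)
         for c, i in product(c_labels, i_labels)),
        key=lambda item: -item[0],
    )
    mapping, used_correct = {}, set()
    for _, c, i in scored:
        if c not in used_correct and i not in mapping:
            mapping[i] = c
            used_correct.add(c)
    return mapping


# Toy nodes: the incorrect program lists its statements in a different order
# and uses subtraction where the correct program uses addition.
correct = {"init": "total = 0", "loop": "total += int(input())", "out": "print(total)"}
incorrect = {"a": "print(s)", "b": "s = 0", "c": "s = s - int(input())"}

mapping = align(correct, incorrect)
print(mapping)  # {'b': 'init', 'a': 'out', 'c': 'loop'}
```

In this toy case the nodes are paired by what they do rather than by where they appear, so the reordered statements still line up, and the `loop`/`c` pair is matched despite the differing operator because both call `int` and `input`. A real alignment would also need to weigh edges between nodes and handle graphs of different sizes.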

Paper Structure (34 sections, 1 equation, 17 figures, 7 tables, 3 algorithms)

Figures (17)

  • Figure 1: Correct and incorrect programs, edits of abstract syntax trees derived from the programs and code fragments
  • Figure 2: CLARA's workflow: each program is translated into a model. Both models are aligned. If they match, the repairer and the interpreter exchange model and trace information using a test case until the incorrect program's model passes such test case.
  • Figure 3: Sample Python source code
  • Figure 4: Model generated by CLARA for the program in Figure 3
  • Figure 5: Sample correct and incorrect program models
  • ...and 12 more figures