Flexible Control Flow Graph Alignment for Delivering Data-Driven Feedback to Novice Programming Learners

Md Towhidul Absar Chowdhury; Maheen Riaz Contractor; Carlos R. Rivero

TL;DR

This work addresses scalable feedback delivery for novice programmers by relaxing the rigid program comparisons used in automated repair. It extends the CLARA framework with a flexible control-flow-graph (CFG) alignment that annotates CFG nodes semantically and takes graph topology into account, enabling repairs across a more diverse set of incorrect programs. Empirical results on Codeforces problems show that flexible alignment yields successful repairs in about 46% of cases, compared with about 5% for the original CLARA baseline, and that it remains effective across problem difficulties. The work also extends CLARA's parser and interpreter to handle real-world Python constructs and demonstrates practical potential for data-driven feedback at scale. Future directions include integration with variable-trace repairs and large language models.

Abstract

Supporting learners in introductory programming assignments at scale is a necessity. This support includes automated feedback on what learners did incorrectly. Existing approaches cast the problem as automatically repairing a learner's incorrect program by extrapolating from an existing correct program written by another learner. However, such approaches are limited because they only compare programs with similar control flow and order of statements. A potentially valuable set of repair feedback from flexible comparisons is thus missing. In this paper, we present several modifications to CLARA, an open-source, data-driven automated repair approach, to deal with real-world introductory programs. We extend CLARA's abstract syntax tree processor to handle common introductory programming constructs. Additionally, we propose a flexible alignment algorithm over control flow graphs in which we enrich nodes with semantic annotations extracted from the operations and calls in each program. Using this alignment, we modify an incorrect program's control flow graph to match that of the correct program, so that CLARA's original repair process can be applied. We evaluate our approach against a baseline on the twenty most popular programming problems in Codeforces. Our results indicate that flexible alignment has a significantly higher percentage of successful repairs, at 46% compared to 5% for baseline CLARA. Our implementation is available at https://github.com/towhidabsar/clara.
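To make the idea of aligning annotated control flow graphs concrete, the following is a minimal sketch and not CLARA's implementation. It assumes each CFG node carries a source fragment, labels nodes with the operations and calls found in that fragment, and scores a candidate mapping as label similarity plus the number of preserved edges. The names `annotate`, `jaccard`, and `align`, the equal weighting of the two scores, and the brute-force search are all illustrative assumptions that only suit very small graphs.

```python
import ast
from itertools import permutations


def annotate(source):
    """Collect operation and call labels from a code fragment (hypothetical helper)."""
    labels = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            labels.add("call:" + node.func.id)
        elif isinstance(node, (ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.AugAssign)):
            labels.add("op:" + type(node.op).__name__)
        elif isinstance(node, ast.Compare):
            labels.update("op:" + type(op).__name__ for op in node.ops)
    return frozenset(labels)


def jaccard(a, b):
    """Set similarity in [0, 1]; two empty label sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def align(nodes_a, edges_a, nodes_b, edges_b):
    """Best injective mapping from graph A's nodes into graph B's nodes.

    Assumes A has no more nodes than B; otherwise no mapping is tried
    and the result is ({}, -1.0).
    """
    ann_a = {n: annotate(src) for n, src in nodes_a.items()}
    ann_b = {n: annotate(src) for n, src in nodes_b.items()}
    names_a = list(nodes_a)
    best_map, best_score = {}, -1.0
    for perm in permutations(nodes_b, len(names_a)):
        mapping = dict(zip(names_a, perm))
        # Semantic part: how similar are the labels of each mapped pair?
        label_score = sum(jaccard(ann_a[a], ann_b[b]) for a, b in mapping.items())
        # Topological part: how many edges of A land on an edge of B?
        edge_score = sum((mapping[u], mapping[v]) in edges_b for u, v in edges_a)
        score = label_score + edge_score
        if score > best_score:
            best_map, best_score = mapping, score
    return best_map, best_score


# Toy graphs: an incorrect program (A) and a correct one (B) that sum inputs.
nodes_a = {"n1": "s = 0", "n2": "s = s + int(input())", "n3": "print(s)"}
edges_a = {("n1", "n2"), ("n2", "n2"), ("n2", "n3")}
nodes_b = {"init": "total = 0", "loop": "total += int(input())", "out": "print(total)"}
edges_b = {("init", "loop"), ("loop", "loop"), ("loop", "out")}

mapping, score = align(nodes_a, edges_a, nodes_b, edges_b)
print(mapping)  # → {'n1': 'init', 'n2': 'loop', 'n3': 'out'}
```

In this toy case the two loop bodies are written differently (`s = s + ...` versus `total += ...`) but receive the same labels, so the mapping pairs them despite the syntactic difference.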

Paper Structure (34 sections, 1 equation, 17 figures, 7 tables, 3 algorithms)

Introduction
Overview
Introduction to CLARA (Gulwani et al., PLDI 2018)
Models, processing and interpreting
Functions and locations
Expressions
Example of a model
Single function alignment and program repair
Parser and interpreter modifications
Print statements
Import statements
Variable assignment
Built-in Python functions
Variable additions and deletions
Input statements
...and 19 more sections

Figures (17)

Figure 1: Correct and incorrect programs, edits of abstract syntax trees derived from the programs and code fragments
Figure 2: CLARA's workflow: each program is translated into a model. Both models are aligned. If they match, the repairer and the interpreter exchange model and trace information using a test case until the incorrect program's model passes such test case.
Figure 3: Sample Python source code
Figure 4: Model generated by CLARA for the program in Figure 3
Figure 5: Sample correct and incorrect program models
...and 12 more figures
