CFlow: Supporting Semantic Flow Analysis of Students' Code in Programming Problems at Scale

Ashley Ge Zhang, Xiaohang Tang, Steve Oney, Yan Chen

TL;DR

CFlow addresses the problem of analyzing thousands of student code submissions with a scalable visualization that combines semantic aggregation with code structure. It uses CodeBERT to embed individual lines of code and an LLM to label line-level errors, and presents the results through three synchronized views (SAV, SHV, CDV) backed by a four-stage algorithm (identify steps, align lines, detect errors, cluster results). In a comparison study against a baseline derived from RunEx and OverCode, participants using CFlow spent about half the time identifying mistakes and recalled twice as many desired patterns from over 6,000 submissions. The findings suggest the approach is practical for instructors who need to give feedback and explore patterns in large CS courses.

Abstract

The high demand for computer science education has led to high enrollments, with thousands of students in many introductory courses. In such large courses, it can be overwhelmingly difficult for instructors to understand class-wide problem-solving patterns or issues, which is crucial for improving instruction and addressing important pedagogical challenges. In this paper, we propose a technique and system, CFlow, for creating understandable and navigable representations of code at scale. CFlow is able to represent thousands of code samples in a visualization that resembles a single code sample. CFlow creates scalable code representations by (1) clustering individual statements with similar semantic purposes, (2) presenting clustered statements in a way that maintains semantic relationships between statements, (3) representing the correctness of different variations as a histogram, and (4) allowing users to navigate through solutions interactively using semantic filters. With a multi-level view design, users can navigate both high-level patterns and low-level implementations. This is in contrast to prior tools that either limit their focus to isolated statements (and thus discard the surrounding context of those statements) or cluster entire code samples (which can lead to large numbers of clusters; for example, if there are n code features and m implementations of each, there can be m^n clusters). We evaluated the effectiveness of CFlow with a comparison study and found that participants using CFlow spent only half the time identifying mistakes and recalled twice as many desired patterns from over 6,000 submissions.
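
The m^n bound mentioned in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the feature and variant names are hypothetical, and the m * n figure for statement-level grouping is an assumption (at most m groups for each of the n features), not a number reported by the paper.

```python
from itertools import product


def whole_program_clusters(m: int, n: int) -> int:
    """Upper bound on clusters when entire code samples are clustered."""
    return m ** n


def statement_level_groups(m: int, n: int) -> int:
    """Upper bound on groups when each feature's statements are grouped separately.

    Assumption for illustration: at most m groups per feature, n features.
    """
    return m * n


# Hypothetical toy problem: n = 3 code features, m = 4 implementations of each.
m, n = 4, 3
implementations = [[f"feature{f}_variant{v}" for v in range(m)] for f in range(n)]

# Enumerate every distinct whole-sample combination to confirm the m^n count.
combinations = list(product(*implementations))

print(len(combinations), whole_program_clusters(m, n))  # 64 64
print(statement_level_groups(m, n))                      # 12
```

Even in this small case, whole-sample clustering can produce 64 clusters, while grouping statements per feature yields at most 12 groups to inspect.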

Paper Structure (44 sections, 2 equations, 4 figures, 2 tables)

This paper contains 44 sections, 2 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: CFlow allows users to explore the semantic flows of student code submissions at a high level through semantic aggregation. First, users are presented with an overview of the entire set of solutions (1), including the SAV (a) and the SHV (b). They can then click on individual code lines (f) to progressively explore the details of specific implementations. The visualization updates to focus on a smaller subset of solutions, displaying the aggregated flow and distributions of the selected set only (2). Users can inspect details of the flow at the individual code level in the CDV (e). CFlow offers a breakdown of the types of errors within that group (k), and detailed solutions with context-aware highlighting (h).
  • Figure 2: CFlow's algorithm. To generate the results required by CFlow's user interface, the algorithm includes four primary stages: (1) identifying and tagging the steps required to solve a problem, (2) grouping and aligning lines of code across code samples, (3) identifying semantic, syntactic, and runtime errors, and (4) clustering the grouped results.
  • Figure 3: An example of what CFlow looks like without the LLM determining line correctness. (a) is a collection of code lines that check the end of a word, and (b) is the correctness histogram. Upon selecting the prominent red block (c), users can view an example that incorrectly checks the end of a word (d). By clicking on "LogicalError" (e), users are then able to explore detailed solutions (f).
  • Figure 4: The baseline system's user interface, derived from RunEx and incorporating OverCode's clustering results, features two main views: the Search Query View (a) and the Code List View (b). In the Code List View, each code block (g) represents a cluster of solutions with identical computation results, with a number in the upper left corner indicating the cluster size. Users can search for specific code patterns using runtime values and text matching (h), with each query displayed in the Search Query View alongside descriptive statistics (d). Queries can be entered directly into the search bar (c). Additionally, the system allows for set operations on these queries (e). Users can filter the code blocks in the Code List View by clicking on a query or using the checkbox (f).
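
The grouping and correctness histogram described in the captions for Figures 2 and 3 can be pictured with a toy sketch. This is not CFlow's implementation. Exact text matching stands in for the CodeBERT-based semantic grouping, the step labels and correctness flags are supplied by hand rather than by an LLM, and `TaggedLine`, `aggregate`, and `histogram` are hypothetical names introduced only for this example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedLine:
    submission_id: int
    step: str      # hypothetical step label (stands in for stage 1)
    text: str      # normalized line of code
    correct: bool  # hand-supplied correctness flag (stands in for stage 3)


def aggregate(lines):
    """Group identical lines within each step and count correct/incorrect uses."""
    groups = defaultdict(Counter)
    for line in lines:
        key = "correct" if line.correct else "incorrect"
        groups[(line.step, line.text)][key] += 1
    return groups


def histogram(groups, step):
    """Return (text, n_correct, n_incorrect) rows for one step, most frequent first."""
    rows = [
        (text, counts["correct"], counts["incorrect"])
        for (s, text), counts in groups.items()
        if s == step
    ]
    return sorted(rows, key=lambda row: -(row[1] + row[2]))


# Hypothetical data: four submissions, two steps.
lines = [
    TaggedLine(1, "check_end_of_word", "if ch == ' ':", True),
    TaggedLine(2, "check_end_of_word", "if ch == ' ':", True),
    TaggedLine(3, "check_end_of_word", "if ch == '':", False),
    TaggedLine(4, "init_counter", "count = 0", True),
]

rows = histogram(aggregate(lines), "check_end_of_word")
for text, n_correct, n_incorrect in rows:
    print(f"{text!r}: correct={n_correct}, incorrect={n_incorrect}")
```

Each row corresponds to one bar of a per-step histogram. Selecting the row with a high incorrect count would be the analogue of clicking the red block in Figure 3.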