Mining Hierarchies with Conviction: Constructing the CS1 Skill Hierarchy with Pairwise Comparisons over Skill Distributions

Dip Kiran Pradhan Newar, Max Fowler, David H. Smith, Seth Poulsen

TL;DR

This work addresses the problem of establishing prerequisite relationships among five CS1 programming skills by applying the directional Conviction measure from association rule mining to binarized exam data collected from four CS1 Python exams (n > 600). The authors binarize scores using multiple thresholds and perform pairwise analyses across ten skill pairs with Wilcoxon tests and Bonferroni correction, finding that Tracing reliably precedes Write, Explain, and Sequence, while Write more often precedes Explain under a mean threshold but not consistently under a median threshold. They also observe co-requisite patterns such as Seq↔Explain and Write↔Seq, suggesting tighter coupling among some skills. The study contributes a data-driven, direction-aware skill hierarchy that can guide CS1 teaching sequences and assessments, while noting limitations related to scoring methods and calling for further validation across contexts.

Abstract

Background and Context: Some skills taught in introductory programming courses are categorized into 1) explaining code, 2) arranging lines of code in correct sequence, 3) tracing through the execution of a program, and 4) writing code from scratch. Objective: Knowing whether one programming skill is a prerequisite to another would help teachers plan the course and structure the order in which they present activities relating to new content. Prior attempts to establish a skill hierarchy have suffered from methodological issues. Method: In this study, we used the conviction measure from association rule mining to perform pairwise comparisons of five skills: Write, Trace, Reverse trace, Sequence, and Explain code. We used data from four exams with more than 600 participants, in which students solved programming problems exercising different skills across several programming topics. Findings: Our findings matched the previous finding that tracing is a prerequisite for students to learn to write code. Contradicting previous claims, our analysis showed that, with the mean threshold, writing code is a prerequisite to explaining code. However, there is no clear relationship when we change the threshold to the median. Unlike prior work, we did not find a clear prerequisite relationship between sequencing code and writing or explaining code. Implications: Our research can help instructors systematically arrange the skills students exercise when encountering a new topic. The goal is to help instructors teach and assess programming in the fashion most effective for learning by leveraging the relationships between skills.
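
As a rough illustration of the method described in the abstract, the sketch below binarizes two score vectors and computes conviction in both directions for one skill pair. It uses the standard association-rule definition of conviction; the function names, the `>=` cut-off rule, and the toy scores are assumptions made for this example and are not taken from the paper, which additionally compares distributions of conviction scores across skill pairs with Wilcoxon tests.

```python
import numpy as np


def binarize(scores, threshold="mean"):
    """Return True where a score is at or above the chosen cut-off.

    Assumption: ">=" is used for the comparison; the paper may differ.
    """
    scores = np.asarray(scores, dtype=float)
    cut = scores.mean() if threshold == "mean" else np.median(scores)
    return scores >= cut


def conviction(a, b):
    """Conviction(A => B) = (1 - supp(B)) / (1 - conf(A => B))."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if not a.any():
        return float("nan")  # the rule never applies
    supp_b = b.mean()                    # fraction of students who have B
    conf_ab = (a & b).sum() / a.sum()    # P(B | A)
    if conf_ab == 1.0:
        return float("inf")              # no student has A without B
    return (1.0 - supp_b) / (1.0 - conf_ab)


# Hypothetical per-student scores in [0, 1] for two skills.
trace_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
write_scores = [0.9, 0.8, 0.2, 0.1, 0.0, 0.1]

trace = binarize(trace_scores, "mean")
write = binarize(write_scores, "mean")

print("Conviction(Trace => Write):", conviction(trace, write))
print("Conviction(Write => Trace):", conviction(write, trace))
```

Because conviction is asymmetric, the two directions can differ sharply for the same pair of skills, which is what makes the measure usable for ordering skills rather than merely correlating them.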

Paper Structure

This paper contains 16 sections, 8 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Student score distribution for all five skills. For Explain, the scores are discrete. The remaining four skills have continuous scores ranging from 0 to 1. The white dot in the box of each violin plot denotes the median for that skill. The bar shows the quartile scores.
  • Figure 2: Visualization of conviction distributions for skills, with conviction scores calculated using the mean threshold. Dots show medians, while the error bars show the first and third quartiles (Q1 and Q3). A point close to the origin $(1,1)$ represents no relationship between skills. A dot close to the y-axis and away from the origin implies a strong prerequisite relationship, since in that case $\textit{Conviction}(A \Rightarrow B)$ is near 1 while $\textit{Conviction}(B \Rightarrow A)$ is high. A dot midway between the dashed line and the y-axis represents a weaker prerequisite relationship. Dots close to the dashed line denote skills that are correlated but where neither is a prerequisite of the other.
  • Figure 3: Visualization of conviction distributions for skills, with conviction scores calculated using the median threshold. Dots show medians, while the error bars show the first and third quartiles (Q1 and Q3). A point close to the origin $(1,1)$ represents no relationship between skills. A dot close to the y-axis and away from the origin implies a strong prerequisite relationship, since in that case $\textit{Conviction}(A \Rightarrow B)$ is near 1 while $\textit{Conviction}(B \Rightarrow A)$ is high. A dot midway between the dashed line and the y-axis represents a weaker prerequisite relationship. Dots close to the dashed line denote skills that are correlated but where neither is a prerequisite of the other.
  • Figure 4: Potential hierarchy of the programming skills based on dependency. A solid arrow indicates that a skill is a prerequisite of the skill it points to under both thresholds we used. A dotted arrow indicates a prerequisite relationship under only one of the thresholds.
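
For readers unfamiliar with the measure, the standard association-rule definition of conviction helps explain why $(1,1)$ is the "no relationship" point in Figures 2 and 3. The notation below ($\mathrm{supp}$, $\mathrm{conf}$, $P$) is a conventional choice for this note and may differ from the paper's own equations:

$$\textit{Conviction}(A \Rightarrow B) = \frac{1 - \mathrm{supp}(B)}{1 - \mathrm{conf}(A \Rightarrow B)} = \frac{P(A)\,P(\neg B)}{P(A \wedge \neg B)}$$

If $A$ and $B$ are independent, then $P(A \wedge \neg B) = P(A)\,P(\neg B)$ and the conviction equals 1 in both directions. The value grows as students who have $A$ but lack $B$ become rarer, so a high $\textit{Conviction}(B \Rightarrow A)$ paired with a near-1 $\textit{Conviction}(A \Rightarrow B)$ means that students who show skill $B$ rarely lack skill $A$, while the reverse does not hold.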