
Test Case-Informed Knowledge Tracing for Open-ended Coding Tasks

Zhangqi Duan, Nigel Fernandez, Alexander Hicks, Andrew Lan

TL;DR

This paper tackles knowledge tracing (KT) for open-ended coding tasks by introducing TIKTOC, a multi-task framework that simultaneously predicts per-test-case pass/fail outcomes and generates student code, using a large language model backbone. To enable this, the authors augment the CodeWorkout dataset with the test cases used for a subset of its open-ended coding questions and formulate KT at the test-case level, which gives finer-grained diagnostics of student knowledge than overall correctness alone. TIKTOC combines test-case-level prediction with code generation in a unified objective, using Llama 3 and an LSTM-based knowledge state, and reports improvements over existing KT baselines on both test-case prediction (AUC) and similarity of the predicted code to actual student code (CodeBLEU), with supporting ablations and qualitative case studies. The work shows how test-case information and open-ended code can jointly inform student models and suggests practical uses in CS education, such as anticipatory feedback, targeted hints, and curriculum design. Public release of the augmented dataset further provides a benchmark for future test-case-level KT methods in programming education.

Abstract

Open-ended coding tasks, which ask students to construct programs according to certain specifications, are common in computer science education. Student modeling can be challenging since the open-ended nature of these tasks means that student code can be diverse. Traditional knowledge tracing (KT) models that only analyze response correctness may not fully capture nuances in student knowledge from student code. In this paper, we introduce Test case-Informed Knowledge Tracing for Open-ended Coding (TIKTOC), a framework to simultaneously analyze and predict both open-ended student code and whether the code passes each test case. We augment the existing CodeWorkout dataset with the test cases used for a subset of the open-ended coding questions, and propose a multi-task learning KT method to simultaneously analyze and predict 1) whether a student's code submission passes each test case and 2) the student's open-ended code, using a large language model as the backbone. We quantitatively show that this method outperforms existing KT methods for coding that only use the overall score a code submission receives. We also qualitatively demonstrate how test case information, combined with open-ended code, helps us gain fine-grained insights into student knowledge.
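
To make the multi-task setup in the abstract concrete, the following is a minimal sketch of how a per-test-case pass/fail loss and a code-generation loss could be combined into one objective. It is an illustration under stated assumptions, not the paper's actual objective: the function names and the mixing parameter `weight` are hypothetical, and the exact form and weighting of the losses are not given in this summary.

```python
import math

EPS = 1e-12  # guards against log(0)


def binary_cross_entropy(pass_probs, pass_labels):
    """Mean binary cross-entropy over test cases.

    pass_probs: predicted probability that the submission passes each test case.
    pass_labels: 1 if the submission actually passed that test case, else 0.
    """
    total = 0.0
    for p, y in zip(pass_probs, pass_labels):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(pass_labels)


def token_nll(token_probs):
    """Mean negative log-likelihood the model assigns to the student's code tokens."""
    return -sum(math.log(max(p, EPS)) for p in token_probs) / len(token_probs)


def multitask_loss(pass_probs, pass_labels, token_probs, weight=0.5):
    """Convex combination of the two task losses.

    `weight` is a hypothetical hyperparameter (an assumption of this sketch).
    """
    test_case_loss = binary_cross_entropy(pass_probs, pass_labels)
    code_loss = token_nll(token_probs)
    return weight * test_case_loss + (1.0 - weight) * code_loss


# Toy example: three test cases and four code tokens.
loss = multitask_loss([0.9, 0.2, 0.7], [1, 0, 1], [0.6, 0.8, 0.5, 0.9])
print(round(loss, 4))
```

Setting `weight` to 1.0 would reduce the sketch to test-case prediction alone, and 0.0 to code generation alone, which mirrors the kind of single-task ablation the summary mentions.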

Paper Structure

This paper contains 30 sections, 8 equations, 2 figures, 7 tables.

Figures (2)

  • Figure 1: Statistics of our test case-augmented CodeWorkout dataset with $3714$ student code submissions.
  • Figure 2: Overview of TIKTOC's model architecture with the Llama 3 LLM as the backbone. TIKTOC embeds the student's previous open-ended code to update the student's knowledge estimate, which is then combined with the current problem as input to Llama 3. TIKTOC simultaneously learns to predict both 1) whether a student's code submission passes each test case, and 2) the student's open-ended code, with a multi-task learning setup.
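
The data flow described in the architecture caption can be sketched in a few lines. This is a toy illustration only, with several loud assumptions: the code embeddings are hand-written vectors rather than LLM embeddings, the LSTM cell is a plain-Python version with random weights, and a single linear head stands in for Llama 3 when mapping the knowledge state and problem to per-test-case pass probabilities. Code generation is omitted, and all names (`lstm_step`, `init_params`, `predict_test_cases`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_params(input_dim, hidden_dim, seed=0):
    """Random weights for the four LSTM gates (input, forget, output, candidate)."""
    rng = random.Random(seed)
    params = {}
    for gate in "ifog":
        params["W_" + gate] = [
            [rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
            for _ in range(hidden_dim)
        ]
        params["b_" + gate] = [0.0] * hidden_dim
    return params


def lstm_step(code_emb, h, c, params):
    """One knowledge-state update from the embedding of a previous submission."""
    z = code_emb + h  # concatenate input and previous hidden state
    gates = {}
    for gate in "ifog":
        pre = [a + b for a, b in zip(matvec(params["W_" + gate], z), params["b_" + gate])]
        activation = math.tanh if gate == "g" else sigmoid
        gates[gate] = [activation(v) for v in pre]
    c_new = [f * cp + i * g for f, cp, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


def predict_test_cases(h, problem_emb, head_W, head_b):
    """Stand-in for the LLM: linear head over [knowledge state; problem embedding]."""
    features = h + problem_emb
    logits = [a + b for a, b in zip(matvec(head_W, features), head_b)]
    return [sigmoid(v) for v in logits]


# Toy run: two previous submissions, then a prediction over three test cases.
hidden_dim, emb_dim, problem_dim, num_tests = 4, 3, 2, 3
params = init_params(emb_dim, hidden_dim)
h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
for previous_code_emb in ([0.1, 0.2, 0.3], [0.0, -0.1, 0.5]):
    h, c = lstm_step(previous_code_emb, h, c, params)

rng = random.Random(1)
head_W = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim + problem_dim)] for _ in range(num_tests)]
head_b = [0.0] * num_tests
print(predict_test_cases(h, [0.2, 0.1], head_W, head_b))
```

The point of the sketch is the ordering of steps: the knowledge state is updated from past submissions first, and only then combined with the current problem to make one prediction per test case.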