Can ChatGPT Play the Role of a Teaching Assistant in an Introductory Programming Course?

Anishka, Atharva Mehta, Nipun Gupta, Aarav Balachandran, Dhruv Kumar, Pankaj Jalote

TL;DR

This study evaluates whether ChatGPT can function as a virtual teaching assistant in an introductory programming course by examining its ability to grade student code and provide feedback. Using GPT-3.5 via API, two experiments compare ChatGPT’s outputs to human TAs across three Python CS1 assignments from a large Indian institution, applying functionality and Halstead/modularity-based quality assessments, as well as subjective and objective feedback evaluations. Results indicate limited reliability in grading accuracy and quality alignment with ground-truth metrics, while feedback generally enhances code modularity with modest or no gains in correctness. The work highlights both the potential and current limitations of LLM-based TAs for automated grading and personalized learning and points to future directions in prompt engineering and model improvements to achieve scalable, fair TA support.

Abstract

The emergence of large language models (LLMs) is expected to have a major impact on education. This paper explores the potential of using ChatGPT, an LLM, as a virtual Teaching Assistant (TA) in an introductory programming course. We evaluate ChatGPT's capabilities by comparing its performance with that of human TAs on some of the important TA functions. We focus on two functions: (1) grading student code submissions, and (2) providing feedback to undergraduate students in an introductory programming course. First, we assess ChatGPT's proficiency in grading student code submissions against a given grading rubric and compare its grades with those assigned by human TAs. Second, we analyze the quality and relevance of the feedback provided by ChatGPT. This evaluation considers how well ChatGPT addresses mistakes and offers suggestions for improvement in student solutions, from both code correctness and code quality perspectives. We conclude with a discussion of the implications of integrating ChatGPT into computing education for automated grading, personalized learning experiences, and instructional support.

Paper Structure (15 sections, 5 tables)