ChatGPT in the classroom. Exploring its potential and limitations in a Functional Programming course

Dan-Matei Popovici

TL;DR

This study empirically evaluates ChatGPT in a Functional Programming (Scala) course, using a corpus of 72 tasks to measure correctness, readability, and educational value. It finds that the first ChatGPT response is correct in about 68% of cases, improving to 86% with follow-up prompts, yet only roughly half of those correct solutions are legible or instructional. ChatGPT demonstrates strong capabilities in automated code reviews, enabling a semi-automated feedback loop, though its usefulness as a sole learning tool is limited, particularly for harder tasks. The authors discuss mitigation strategies, compare ChatGPT with GitHub Copilot, and propose future directions, including integrating Copilot into FP curricula and developing automated, publicly accessible code-review tools to support educators. Overall, the work provides data-driven insights into how AI can augment programming education while highlighting the need for human oversight and pedagogical adaptation.

Abstract

In November 2022, OpenAI introduced ChatGPT, a chatbot based on supervised and reinforcement learning. Not only can it answer questions with human-like responses, but it can also generate code from scratch or complete coding templates provided by the user. ChatGPT can generate unique responses that render any traditional anti-plagiarism tool useless. Its release has ignited a heated debate about its usage in academia, especially by students. We found, to our surprise, that our students at POLITEHNICA University of Bucharest (UPB) had been using generative AI tools (ChatGPT and its predecessors) to solve homework for at least six months. We therefore set out to explore the capabilities of ChatGPT and assess its value for educational purposes. We used ChatGPT to solve all the coding assignments for the semester from our UPB Functional Programming course. We discovered that, although ChatGPT provides correct answers in 68% of the cases, only around half of those are legible solutions that can benefit students in some form. On the other hand, ChatGPT is very good at performing code review on student programming homework. Based on these findings, we discuss the pros and cons of ChatGPT in education.

Paper Structure (20 sections, 7 figures)

Figures (7)

  • Figure 1: Survey results: answers to questions (a) When did you hear about generative AI? and (b) How many times have you used generative AI for homework or other school activities?
  • Figure 2: Survey results: answers to questions (a) Has generative AI helped you gain a better understanding of curricula? and (b) Do you believe generative AI has good accuracy?
  • Figure 3: Breakdown of our dataset into: (a) hard, medium and easy exercises, (b) exercises with simple/complex statements and (c) small, medium or large-sized solutions.
  • Figure 4: Evaluation results: (a) ChatGPT exercise correctness rates and (b) correct test generation rates.
  • Figure 5: Correctness of ChatGPT solutions per statement complexity, expressed as: (a) percentages and (b) absolute values from the dataset.
  • ...and 2 more figures