Cracking the Code: Evaluating Zero-Shot Prompting Methods for Providing Programming Feedback

Niklas Ippisch; Anna-Carolina Haensch; Jan Simson; Jacob Beck; Markus Herklotz; Malte Schierholz

TL;DR

This work addresses how to elicit high-quality feedback on beginner programming errors from large language models (LLMs). It introduces a structured evaluation framework, grounded in Ryan et al. (2020), and uses it to compare four zero-shot prompting strategies (Chain of Thought, Prompt Chaining, Tree of Thought, and ReAct) against a vanilla baseline on beginner R programming errors. Key findings indicate that prompts enforcing a stepwise process yield more precise feedback, while prompts that omit explicit instructions about which provided data to analyze improve error identification, highlighting a trade-off between localization and actionable remediation. The framework is designed to transfer to other programming languages and coding tasks, helping educators and researchers systematically assess the quality of LLM-generated feedback.

Abstract

Despite the growing use of large language models (LLMs) for providing feedback, little research has explored how to achieve high-quality feedback. This case study introduces an evaluation framework for assessing different zero-shot prompt engineering methods. We varied the prompts systematically and analyzed the resulting feedback on programming errors in R. The results suggest that prompts prescribing a stepwise procedure increase precision, while omitting explicit specifications about which of the provided data to analyze improves error identification.
