How Beginning Programmers and Code LLMs (Mis)read Each Other
Sydney Nguyen, Hannah McLean Babe, Yangtian Zi, Arjun Guha, Carolyn Jane Anderson, Molly Q Feldman
TL;DR
This study investigates how near-novice programmers (students who have completed CS1) interact with Code LLMs in a controlled natural-language-to-code (NL-to-code) task. Using a large-scale lab study spanning three institutions, the authors isolate prompt writing and prompt editing across 48 problems, with automatic correctness feedback to measure success. They find that beginners struggle both to describe problems in natural language and to edit prompts effectively, and that non-deterministic model outputs add to the difficulty. The results highlight persistent gaps between novice mental models and Code LLM behavior, raise equity concerns for first-generation students, and indicate that teaching explicit NL-to-code prompting and strong code understanding remains essential. Overall, Code LLMs are not a universal shortcut for novice programmers; careful design of tools, pedagogy, and evaluation methods is needed to unlock their educational potential.
Abstract
Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluating the correctness of generated code, and editing prompts when the generated code is incorrect. This paper presents a large-scale controlled study of how 120 beginning coders across three academic institutions approach writing and editing prompts. A novel experimental design allows us to target specific steps in the text-to-code process and reveals that beginners struggle with writing and editing prompts, even for problems at their skill level and when correctness is automatically determined. Our mixed-methods evaluation provides insight into student processes and perceptions with key implications for non-expert Code LLM use within and outside of education.
