Designing for Novice Debuggers: A Pilot Study on an AI-Assisted Debugging Tool

Oka Kurniawan, Erick Chandra, Christopher M. Poskitt, Yannic Noller, Kenny Tsu Wei Choo, Cyrille Jegourel

TL;DR

This paper addresses the challenge of novice debugging, particularly semantic errors, by introducing CodeHinter, a Visual Studio Code extension that blends spectrum-based fault localization (FauxPy) with large-language-model–generated hints, quizzes, and print-statement guidance to promote active, incremental debugging. Unlike tools that provide full automated fixes, CodeHinter guides learners through diagnosing and fixing their own code, preserving user engagement and analytical reasoning. A pilot study (n=10) comparing CodeHinter to SID shows higher usability and satisfaction for CodeHinter (SUS ~75 vs ~65; overall ~89 vs ~60), with End-to-End Test and error localization identified as the most valuable features. The results support personalized AI assistance, suggesting that tailoring debugging support to user profiles and problem contexts can improve learning outcomes and reduce over-reliance on AI-generated solutions.
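To make the fault localization step concrete, the following is a minimal sketch of how spectrum-based fault localization ranks lines from per-test coverage. It is an illustration only, not CodeHinter's or FauxPy's implementation: the Ochiai formula is assumed here because it is a common choice, the summary does not say which formula the tool uses, and the function name `ochiai_scores` and the toy coverage data are hypothetical.

```python
import math
from typing import Dict, List, Set


def ochiai_scores(coverage: List[Set[int]], passed: List[bool]) -> Dict[int, float]:
    """Score each executed line with the Ochiai suspiciousness formula.

    coverage[i] is the set of line numbers executed by test i (hypothetical input
    format); passed[i] is True when test i passed.
    """
    total_failed = sum(1 for ok in passed if not ok)
    all_lines = set().union(*coverage)
    scores: Dict[int, float] = {}
    for line in all_lines:
        # ef / ep: failing / passing tests that executed this line.
        ef = sum(1 for cov, ok in zip(coverage, passed) if line in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, passed) if line in cov and ok)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return scores


# Toy spectrum (assumed data): one passing test and two failing tests.
coverage = [{1, 2, 3}, {1, 2, 4}, {1, 4}]
passed = [True, False, False]

scores = ochiai_scores(coverage, passed)
# Most suspicious line first; line 4 is executed only by failing tests.
ranked = sorted(scores, key=lambda line: scores[line], reverse=True)
print(ranked)
```

Lines executed mostly by failing tests rise to the top of the ranking, which is the kind of signal a tool can turn into a highlighted region or a hint rather than a finished fix.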

Abstract

Debugging is a fundamental skill that novice programmers must develop. Numerous tools have been created to assist novice programmers in this process. Recently, large language models (LLMs) have been integrated with automated program repair techniques to generate fixes for students' buggy code. However, many of these tools foster an over-reliance on AI and do not actively engage students in the debugging process. In this work, we aim to design an intuitive debugging assistant, CodeHinter, that combines traditional debugging tools with LLM-based techniques to help novice debuggers fix semantic errors while promoting active engagement in the debugging process. We present findings from our second design iteration, which we tested with a group of undergraduate students. Our results indicate that the students found the tool highly effective in resolving semantic errors and significantly easier to use than the first version. Consistent with our previous study, error localization was the most valuable feature. Finally, we conclude that any AI-assisted debugging approach should be personalized based on user profiles to optimize their interactions with the tool.

Paper Structure

This paper contains 14 sections, 4 figures.

Figures (4)

  • Figure 1: (Left) The 'End-to-End Test' feature, the main feature of CodeHinter. The screenshot shows the state when users encounter failed test cases (corresponding to * in Figure 2), highlighting the expected value and actual output in the text editor while providing a brief explanation in the chatbot. (Right) The CodeHinter 'Expand Menu', where users can access other features.
  • Figure 2: Flow of user interactions with our tool. The process begins with users pressing the 'End-to-End Test' button, which checks for syntax and semantic errors. Users are then guided to complete the missing debugging steps. If test cases fail, the tool provides options to help users resolve the bugs. The asterisk indicates the state of the screenshot in Figure 1.
  • Figure 3: User perspectives on the usefulness of CodeHinter's features for fixing semantic errors.
  • Figure 4: Frequency of access and utilization of features.