Exploring Interaction Patterns for Debugging: Enhancing Conversational Capabilities of AI-assistants

Bhavya Chopra, Yasharth Bajpai, Param Biyani, Gustavo Soares, Arjun Radhakrishna, Chris Parnin, Sumit Gulwani

TL;DR

Problem: LLM-based debugging assistants often leap to action with insufficient context, producing inaccurate responses. Approach: Robin, an IDE-integrated AI assistant, combines insert expansion, turn-taking, and debugging-workflow awareness, controlled by a Hardness Classifier, Eager Responder, Collaborative Responder, and Follow-up Generator. Findings: In a within-subjects study with 12 industry professionals, Robin achieved a 5x improvement in bug resolution rates and more effective fault localization, with participants reporting actionable plans and lower conversation barriers. Significance: The results show that domain-specific interaction patterns and collaborative dialogue can meaningfully enhance debugging effectiveness and guide deeper IDE integration and personalization.

Abstract

The widespread availability of Large Language Models (LLMs) within Integrated Development Environments (IDEs) has led to their speedy adoption. Conversational interactions with LLMs enable programmers to obtain natural language explanations for various software development tasks. However, LLMs often leap to action without sufficient context, giving rise to implicit assumptions and inaccurate responses. Conversations between developers and LLMs are primarily structured as question-answer pairs, where the developer is responsible for asking the right questions and sustaining conversations across multiple turns. In this paper, we draw inspiration from interaction patterns and conversation analysis to design Robin, an enhanced conversational AI-assistant for debugging. Through a within-subjects user study with 12 industry professionals, we find that equipping the LLM to (1) leverage the insert expansion interaction pattern, (2) facilitate turn-taking, and (3) utilize debugging workflows leads to lowered conversation barriers, effective fault localization, and a 5x improvement in bug resolution rates.

Paper Structure (12 sections, 4 figures, 2 tables)

This paper contains 12 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Contrasting conversations for the same task on both the AI-assistants. Text in purple represents the follow-up questions generated by the AI-assistants.
  • Figure 2: Agent-based workflow for Robin. The Hardness Classifier determines the bug severity and decides if it can be resolved in one shot. The Eager Responder suggests a fix in a single-turn response for such errors. The Collaborative Responder engages in a multi-turn conversation, providing instructions to use the debugger and seeking additional information from the developer as the need arises. The Follow-up Generator uses the conversation context to produce prompts that the user will likely say next.
  • Figure 3: Study Setup. Participants are exposed to the exception window (A), which has a "Chat with AI" button to invoke the AI-assistant (B). The assistant is available as a chat panel on the right side (C).
  • Figure 4: Time spent on each stage of the tasks, with and without AI assistance. Task 1 (Left), Task 2 (Right). Developers spent significantly more time localizing errors themselves with the baseline AI-assistant.
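
The routing described in the Figure 2 caption can be illustrated with a minimal Python sketch. This is not Robin's implementation: the text here does not give prompts, models, or interfaces, so every name below (`Conversation`, `respond`, and the `toy_*` stubs) is hypothetical. In particular, the keyword check in `toy_classifier` is an arbitrary stand-in for what would presumably be an LLM-based Hardness Classifier. The sketch only shows the control flow: classify the bug, pick the Eager or Collaborative Responder, then generate follow-up prompts from the updated conversation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Conversation:
    """Running chat context: the exception text plus the turns so far."""
    exception: str
    turns: List[str] = field(default_factory=list)


# Each component is a plain callable so that a real LLM call could be swapped in.
Classifier = Callable[[Conversation], bool]   # True means "resolvable in one shot"
Responder = Callable[[Conversation], str]
FollowUps = Callable[[Conversation], List[str]]


def respond(
    conv: Conversation,
    is_one_shot: Classifier,
    eager: Responder,
    collaborative: Responder,
    follow_ups: FollowUps,
) -> Tuple[str, List[str]]:
    """Route one assistant turn through the four components named in Figure 2."""
    # The classifier decides which responder handles this turn.
    reply = eager(conv) if is_one_shot(conv) else collaborative(conv)
    conv.turns.append(reply)
    # Follow-ups are generated after the reply so they can depend on it.
    return reply, follow_ups(conv)


# Hypothetical stubs standing in for LLM-backed components (assumptions, not Robin's logic).
def toy_classifier(conv: Conversation) -> bool:
    return "NullReferenceException" in conv.exception


def toy_eager(conv: Conversation) -> str:
    return "Suggested fix: add a null check before the dereference."


def toy_collaborative(conv: Conversation) -> str:
    return "Set a breakpoint at the failing line and tell me the value of the index."


def toy_follow_ups(conv: Conversation) -> List[str]:
    return ["Here is the value I see in the debugger.", "Can you explain the root cause?"]


if __name__ == "__main__":
    conv = Conversation(exception="IndexOutOfRangeException in ParseRows")
    reply, suggestions = respond(
        conv, toy_classifier, toy_eager, toy_collaborative, toy_follow_ups
    )
    print(reply)
    for s in suggestions:
        print("  follow-up:", s)
```

Passing the components as callables keeps the routing logic separate from how each component is realized, so the stubs could be replaced with model-backed functions without changing `respond`.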