GazeCopilot: Evaluating Novel Gaze-Informed Prompting for AI-Supported Code Comprehension and Readability
Yasmine Elfares, Gül Çalikli, Mohamed Khamis
TL;DR
The paper tackles prompt quality in AI-assisted coding by introducing Real-time GazeCopilot, which uses real-time eye-tracking data to tailor prompts and refactor code for better comprehension and readability. In a within-subject lab study with 25 developers, the authors compare Standard Copilot, Real-time GazeCopilot, and Pre-set GazeCopilot, measuring comprehension accuracy, time, readability, and user experience. Results show that gaze-informed prompts substantially improve comprehension accuracy and perceived readability, with faster processing and lower cognitive complexity for refactored code, without degrading perceived agency. These findings support the potential of integrating real-time physiological signals into AI prompting to enable more adaptive, user-centered human-AI collaboration in software development, though real-world deployments will need to address privacy and consent.
Abstract
AI-powered coding assistants, like GitHub Copilot, are increasingly used to boost developers' productivity. However, their output quality hinges on the contextual richness of the prompts. Meanwhile, gaze behaviour carries rich cognitive information, providing insights into how developers process code. We leverage this in Real-time GazeCopilot, a novel approach that refines prompts using real-time gaze data to improve code comprehension and readability. It integrates gaze metrics, such as fixation patterns and pupil dilation, into prompts so that suggestions adapt to developers' cognitive states. In a controlled lab study with 25 developers, we evaluated Real-time GazeCopilot against two baselines: Standard Copilot, which relies on text prompts provided by developers, and Pre-set GazeCopilot, which uses a hard-coded prompt that assumes developers' gaze metrics indicate they are struggling with all aspects of the code. The Pre-set baseline allows us to assess the impact of leveraging the developer's personal real-time gaze data. Our results show that prompts dynamically generated using developers' real-time gaze data significantly improve code comprehension accuracy, reduce comprehension time, and improve perceived readability compared to Standard Copilot. Real-time GazeCopilot selectively refactors only the code aspects where gaze data indicate difficulty; it outperforms the overgeneralized refactoring done by Pre-set GazeCopilot because it avoids revising code the developer already understands.
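The contrast between the real-time and pre-set conditions can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the metric schema, thresholds, or prompt wording, so the `GazeSample` fields, both threshold constants, and the prompt template below are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the abstract does not state any values.
FIXATION_MS_THRESHOLD = 400.0
PUPIL_DILATION_THRESHOLD = 0.15


@dataclass
class GazeSample:
    """Aggregated gaze metrics for one code aspect (illustrative schema)."""
    aspect: str               # e.g. "variable naming", "nested loops"
    mean_fixation_ms: float   # average fixation duration on that aspect
    pupil_dilation: float     # relative change from the developer's baseline


def struggling_aspects(samples):
    """Return aspects whose metrics exceed either assumed threshold."""
    return [
        s.aspect
        for s in samples
        if s.mean_fixation_ms > FIXATION_MS_THRESHOLD
        or s.pupil_dilation > PUPIL_DILATION_THRESHOLD
    ]


def build_prompt(aspects):
    """Turn a list of aspects into a refactoring prompt (assumed wording)."""
    if not aspects:
        return "Leave the code unchanged."
    return (
        "Refactor the code to improve readability, changing only: "
        + ", ".join(aspects)
        + "."
    )


def realtime_prompt(samples):
    """Real-time condition: target only aspects flagged by gaze data."""
    return build_prompt(struggling_aspects(samples))


def preset_prompt(all_aspects):
    """Pre-set condition: assume difficulty with every aspect."""
    return build_prompt(list(all_aspects))


samples = [
    GazeSample("variable naming", 520.0, 0.05),
    GazeSample("nested loops", 210.0, 0.02),
    GazeSample("comments", 300.0, 0.22),
]
print(realtime_prompt(samples))
print(preset_prompt(s.aspect for s in samples))
```

In this sketch the real-time prompt names only the two aspects whose metrics cross a threshold, while the pre-set prompt names all three, which mirrors the selective versus overgeneralized refactoring the abstract describes.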
