Programmer Visual Attention During Context-Aware Code Summarization
Robert Wallace, Aakash Bansal, Zachary Karas, Ningzhi Tang, Yu Huang, Toby Jia-Jun Li, Collin McMillan
TL;DR
The study addresses how programmers visually navigate project-level context during context-aware code summarization and whether gaze patterns relate to summary quality. It uses an IDE-based eye-tracking setup with 10 Java programmers across five GitHub projects to elicit 40 context-aware summaries per participant, enabling detailed analysis of gaze metrics and context usage. Key contributions include a dataset of 394 context-aware summaries, a comprehensive analysis of gaze patterns across task progression, and actionable guidance on which context types to prioritize for improving context-aware automatic summarization. The findings illuminate diminishing returns in context coverage and highlight that class-level context dominates attention, offering concrete guidance for distilling relevant context for AI-assisted code summarization and related tasks.
Abstract
Abridged: Programmer attention represents the visual focus of programmers on parts of the source code in pursuit of programming tasks. We conducted an in-depth human study with 10 Java programmers, where each programmer generated summaries for 40 methods from five large Java projects over five one-hour sessions. We used eye-tracking equipment to map the visual attention of programmers while they wrote the summaries. We also rated the quality of each summary. We found eye-gaze patterns and metrics that define common behaviors in programmer attention during context-aware code summarization. Specifically, we found that programmers read significantly (p<0.01) fewer words and made significantly (p<0.03) fewer revisits to words as they summarized more methods during a session, while maintaining the quality of their summaries. We also found that the amount of source code a participant looks at correlates with higher-quality summaries, but this trend follows a bell-shaped curve: beyond a threshold, reading more source code leads to a significant (p<0.01) decrease in summary quality. We also gathered insight into the types of methods in the project that provide the most contextual information for code summarization, based on programmer attention. Specifically, we observed that programmers spent a majority of their time looking at methods inside the same class as the target method to be summarized. Surprisingly, we found that programmers spent significantly less time looking at methods in the call graph of the target method. We discuss how our empirical observations may aid future studies towards modeling programmer attention and improving context-aware automatic source code summarization.
