Programmer Visual Attention During Context-Aware Code Summarization

Robert Wallace, Aakash Bansal, Zachary Karas, Ningzhi Tang, Yu Huang, Toby Jia-Jun Li, Collin McMillan

TL;DR

The study addresses how programmers visually navigate project-level context during context-aware code summarization and whether gaze patterns relate to summary quality. It uses an IDE-based eye-tracking setup with 10 Java programmers across five GitHub projects to elicit 40 context-aware summaries per participant, enabling detailed analysis of gaze metrics and context usage. Key contributions include a dataset of 394 context-aware summaries, a comprehensive analysis of gaze patterns across task progression, and actionable guidance on which context types to prioritize for improving context-aware automatic summarization. The findings illuminate diminishing returns in context coverage and highlight that class-level context dominates attention, offering concrete guidance for distilling relevant context for AI-assisted code summarization and related tasks.

Abstract

Abridged: Programmer attention represents the visual focus of programmers on parts of the source code in pursuit of programming tasks. We conducted an in-depth human study with 10 Java programmers, where each programmer generated summaries for 40 methods from five large Java projects over five one-hour sessions. We used eye-tracking equipment to map the visual attention of programmers while they wrote the summaries. We also rated the quality of each summary. We found eye-gaze patterns and metrics that define common behaviors in programmer attention during context-aware code summarization. Specifically, we found that programmers need to read significantly (p<0.01) fewer words and make significantly (p<0.03) fewer revisits to words as they summarize more methods during a session, while maintaining the quality of summaries. We also found that the amount of source code a participant looks at correlates with higher summary quality, but this trend follows a bell-shaped curve: beyond a threshold, reading more source code leads to a significant (p<0.01) decrease in the quality of summaries. We also gathered insight into the type of methods in the project that provide the most contextual information for code summarization based on programmer attention. Specifically, we observed that programmers spent a majority of their time looking at methods inside the same class as the target method to be summarized. Surprisingly, we found that programmers spent significantly less time looking at methods in the call graph of the target method. We discuss how our empirical observations may aid future studies towards modeling programmer attention and improving context-aware automatic source code summarization.

Paper Structure (26 sections, 4 figures, 5 tables)


Figures (4)

  • Figure 1: A screenshot of our interface. The blue box highlights the summary generated by the participants, while the red box highlights the navigational window limited to one project. The participant is free to open and read any part of the project.
  • Figure 2: Graphs illustrating the distribution of programmer attention for varying groups defined on the X-axis as (a) project names, (b) the order in which the session occurred, (c) the order in which the method was seen, and (d) the participant ID. The legend on top is common to the bar colors on all graphs. The Y-axis in each graph shows the mean value for each type of context. The values were normalized by the total time spent fixating on context outside the target method for each method-summary pair, prior to computing the mean for each category.
  • Figure 3: Bar chart showing the delta values for gaze metrics computed between summaries rated low (<=3) and high (5) in terms of completeness and conciseness.
  • Figure 4: Example with source code of a method in project mallet, accompanied by summaries written by participants 3, 4, and 10.
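
The normalization described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' analysis code: the record layout, the function name `mean_normalized_attention`, the context-type labels, and the choice to count an unvisited context type as a share of zero are all assumptions made for illustration. Each method-summary pair's fixation time on context outside the target method is first converted into shares that sum to 1, and only then averaged within a group (for example, a project).

```python
from collections import defaultdict


def mean_normalized_attention(fixations, context_types):
    """Return {group: {context_type: mean share of context fixation time}}.

    fixations: iterable of (pair_id, group, context_type, duration) tuples,
    covering only fixations outside the target method (hypothetical layout).
    """
    # Sum fixation time per method-summary pair, split by context type.
    per_pair = defaultdict(lambda: defaultdict(float))
    pair_group = {}
    for pair_id, group, ctx, duration in fixations:
        per_pair[pair_id][ctx] += duration
        pair_group[pair_id] = group

    # Normalize each pair by its own total, then collect the shares per group.
    shares = defaultdict(lambda: defaultdict(list))
    for pair_id, by_ctx in per_pair.items():
        total = sum(by_ctx.values())
        if total == 0:
            continue  # no context fixations recorded for this pair
        for ctx in context_types:
            # Assumption: a context type never looked at counts as share 0.
            shares[pair_group[pair_id]][ctx].append(by_ctx.get(ctx, 0.0) / total)

    # Average the per-pair shares within each group.
    return {
        group: {ctx: sum(vals) / len(vals) for ctx, vals in by_ctx.items()}
        for group, by_ctx in shares.items()
    }


# Made-up durations for two method-summary pairs in one project.
example = [
    ("p1-m1", "mallet", "same_class", 300.0),
    ("p1-m1", "mallet", "call_graph", 100.0),
    ("p2-m1", "mallet", "same_class", 50.0),
    ("p2-m1", "mallet", "other", 50.0),
]
result = mean_normalized_attention(example, ["same_class", "call_graph", "other"])
print(result["mallet"])
```

Normalizing per pair before averaging keeps a single long-running summary from dominating the group mean, which is one plausible reason for the order of operations stated in the caption.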