eye2vec: Learning Distributed Representations of Eye Movement for Program Comprehension Analysis

Haruhiko Yoshioka, Kazumasa Shimari, Hidetake Uwano, Kenichi Matsumoto

TL;DR

eye2vec addresses the challenge of semantically interpreting developers' eye movements during code reading by mapping fixations to syntactic elements and leveraging path-context embeddings derived from code2vec. The approach creates eye vectors that capture higher-order semantic relationships, enabling automated data-mining analyses and label prediction of developer characteristics. The main contributions are a novel architecture linking eye-tracking data with semantic code representations and concrete use cases that demonstrate data-driven insights and predictive capabilities. This work offers a scalable, semantically grounded framework for studying program comprehension that can inform education and tooling, with future work focused on applying deep learning techniques to the distributed representations while preserving privacy of input data.

Abstract

This paper presents eye2vec, an infrastructure for analyzing software developers' eye movements while they read source code. In typical eye-tracking studies of program comprehension, researchers must preselect analysis targets, such as control flow or syntactic elements, and then develop analysis methods that extract appropriate metrics from fixations on the source code. Researchers can define areas of interest (AOIs) at various levels, such as words, lines, or code blocks, and this choice leads to different results. Moreover, the interpretation of a fixation on a word or line can vary with the purpose of the analysis. Eye-tracking analysis is therefore a difficult task that depends on time-consuming manual work by researchers. eye2vec represents two consecutive fixations as a transition between syntactic elements, using distributed representations. These distributed representations facilitate the adoption of diverse data analysis methods with rich semantic interpretations.
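The idea of treating two consecutive fixations as a transition between syntactic elements can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Python's standard `ast` module as a stand-in parser, a hypothetical `Fixation` record that is already mapped to a line and column, and a seeded pseudo-random `toy_embedding` in place of the code2vec-derived path-context embeddings that eye2vec builds on. All function names here are illustrative.

```python
import ast
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    """One fixation already mapped to a source position (hypothetical format)."""
    line: int         # 1-based line number in the source file
    col: int          # 0-based column offset
    duration_ms: int  # fixation duration; unused in this sketch


def node_type_at(tree: ast.AST, line: int, col: int) -> str:
    """Return the type name of the smallest AST node covering (line, col)."""
    best, best_span = "Module", None
    for node in ast.walk(tree):
        # Skip nodes without position information (e.g. Module, Load, Add).
        if getattr(node, "lineno", None) is None or getattr(node, "end_lineno", None) is None:
            continue
        start = (node.lineno, node.col_offset)
        end = (node.end_lineno, node.end_col_offset)
        if start <= (line, col) < end:
            span = (end[0] - start[0], end[1] - start[1])
            # ast.walk visits parents first, so "<=" prefers the deeper node on ties.
            if best_span is None or span <= best_span:
                best, best_span = type(node).__name__, span
    return best


def transitions(tree: ast.AST, fixations: list[Fixation]) -> list[tuple[str, str]]:
    """Pair each fixation with the next one as a (from_type, to_type) transition."""
    types = [node_type_at(tree, f.line, f.col) for f in fixations]
    return list(zip(types, types[1:]))


def toy_embedding(transition: tuple[str, str], dim: int = 8) -> list[float]:
    """Stand-in embedding (assumption): a fixed pseudo-random vector per transition."""
    rng = random.Random("->".join(transition))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def eye_vector(trans: list[tuple[str, str]], dim: int = 8) -> list[float]:
    """Average the transition embeddings into one vector for the reading session."""
    if not trans:
        return [0.0] * dim
    vectors = [toy_embedding(t, dim) for t in trans]
    return [sum(component) / len(vectors) for component in zip(*vectors)]


SOURCE = """\
def total(xs):
    s = 0
    for x in xs:
        s += x
    return s
"""

if __name__ == "__main__":
    tree = ast.parse(SOURCE)
    gaze = [Fixation(1, 10, 180), Fixation(3, 13, 220),
            Fixation(4, 10, 150), Fixation(5, 11, 200)]
    steps = transitions(tree, gaze)
    print(steps)
    print(eye_vector(steps))
```

Because every transition becomes a fixed-length vector, the averaged result can be passed to standard clustering or classification tools without choosing an AOI granularity by hand. A real pipeline would replace `toy_embedding` with learned embeddings.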

Paper Structure

This paper contains 6 sections and 1 figure.

Figures (1)

  • Figure 1: Overview of the eye2vec architecture