iTrace: Click-Based Gaze Visualization on the Apple Vision Pro

Esra Mehmedova, Santiago Berrezueta-Guzman, Stefan Wagner

TL;DR

iTrace presents a click-based gaze extraction framework that works around the Apple Vision Pro's privacy-imposed lack of continuous raw gaze data. A client-server pipeline (a Swift client on the Vision Pro and a Python Flask server) converts interaction events into video and spatial heatmaps, supporting both individual and averaged gaze visualizations. In a between-subjects study with 20 participants, a gaming controller achieved a far higher average data rate (14.22 clicks/s) than dwell control (0.45 clicks/s), producing denser heatmaps while maintaining a gaze precision of about 91%. The work points to applications in education, environmental design, marketing, and clinical science, acknowledges the privacy limitations, and recommends use in research settings only; the authors provide open-source code for reproducibility.

Abstract

The Apple Vision Pro is equipped with accurate eye-tracking capabilities, yet the privacy restrictions on the device prevent direct access to continuous user gaze data. This study introduces iTrace, a novel application that overcomes these limitations through click-based gaze extraction techniques, including manual methods like a pinch gesture, and automatic approaches utilizing dwell control or a gaming controller. We developed a system with a client-server architecture that captures the gaze coordinates and transforms them into dynamic heatmaps for video and spatial eye tracking. The system can generate individual and averaged heatmaps, enabling analysis of personal and collective attention patterns. To demonstrate its effectiveness and evaluate its usability and performance, a study was conducted with two groups of 10 participants, each testing a different clicking method. The 8BitDo controller achieved a higher average data collection rate of 14.22 clicks/s, compared to 0.45 clicks/s with dwell control, enabling significantly denser heatmap visualizations. The resulting heatmaps reveal distinct attention patterns, including concentrated focus in lecture videos and broader scanning during problem-solving tasks. By allowing dynamic attention visualization while maintaining a high gaze precision of 91%, iTrace demonstrates strong potential for a wide range of applications in educational content engagement, environmental design evaluation, marketing analysis, and clinical cognitive assessment. Given the current gaze data restrictions on the Apple Vision Pro, we encourage developers to use iTrace only in research settings.
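The step from click coordinates to individual and averaged heatmaps can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian kernel, the `sigma` value, and the pixel-coordinate convention are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np

def gaze_heatmap(points, width, height, sigma=25.0):
    """Accumulate one isotropic Gaussian per click into a (height, width) grid.

    `points` is an iterable of (x, y) pixel coordinates (hypothetical format).
    The result is scaled so that its maximum is 1, or left all-zero if empty.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        heat += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    peak = heat.max()
    return heat / peak if peak > 0 else heat

def averaged_heatmap(per_participant_points, width, height, sigma=25.0):
    """Average the normalized heatmaps of several participants pixel by pixel."""
    maps = [gaze_heatmap(p, width, height, sigma) for p in per_participant_points]
    return np.mean(maps, axis=0)

# Two hypothetical participants, one click each, on a 64x48 frame.
individual = gaze_heatmap([(10, 20)], width=64, height=48, sigma=3.0)
averaged = averaged_heatmap([[(10, 20)], [(40, 30)]], width=64, height=48, sigma=3.0)
```

Normalizing each participant's map before averaging gives every participant equal weight regardless of how many clicks they produced, which matters when click rates differ as much as they do between the controller and dwell-control groups. Whether iTrace weights participants this way is not stated in the abstract.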

Paper Structure

This paper contains 52 sections, 15 figures, 1 table.

Figures (15)

  • Figure 1: The iTrace pipeline for click‑based gaze mapping on the Apple Vision Pro. (Left) Video eye tracking: the Swift app captures the gaze data and sends it to the server, which produces heatmap videos. (Right) Spatial eye tracking: the application triggers environment recording and gaze capture, then the server overlays heatmaps on the mirrored footage.
  • Figure 2: Click‐based interaction methods: (top) pinch gesture, (middle) dwell control, and (bottom) gaming controller.
  • Figure 3: Precision calibration interface: users tap the center cross to measure eye‐tracking accuracy, with the red marker indicating the recorded gaze point and the resulting precision score displayed below.
  • Figure 4: Clicking speed assessment interface: users tap the circle repeatedly to fill it, and the measured click rate is displayed upon completion.
  • Figure 5: Heatmaps with an increasing number of gaze points, from left to right.
  • ...and 10 more figures
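
A click rate in clicks per second, like the one displayed by the interface in Figure 4, is a count of clicks divided by the elapsed time. The sketch below shows that arithmetic only; the function name, the timestamp format, and the session-window parameters are hypothetical and are not taken from the iTrace code.

```python
def click_rate(click_times_s, session_start_s, session_end_s):
    """Return clicks per second within [session_start_s, session_end_s].

    `click_times_s` holds click timestamps in seconds (hypothetical format).
    """
    duration = session_end_s - session_start_s
    if duration <= 0:
        raise ValueError("session must have a positive duration")
    # Count only the clicks that fall inside the session window.
    n_clicks = sum(1 for t in click_times_s if session_start_s <= t <= session_end_s)
    return n_clicks / duration

# Hypothetical timestamps: three of the four clicks fall inside a 2-second window.
rate = click_rate([0.5, 1.0, 1.5, 2.5], session_start_s=0.0, session_end_s=2.0)

# Ratio of the two average rates reported in the abstract.
controller_vs_dwell = 14.22 / 0.45
```

With the averages reported in the abstract, the controller group produced about 31.6 times as many gaze samples per second as the dwell-control group.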