SocialEyes: Scaling mobile eye-tracking to multi-person social settings

Shreshth Saxena, Areez Visram, Neil Lobo, Zahid Mirza, Mehak Rafi Khan, Biranugan Pirabaharan, Alexander Nguyen, Lauren K. Fink

TL;DR

SocialEyes addresses the challenge of scaling eye-tracking to multi-person social settings by synchronizing and mapping gaze data from egocentric views to a shared centralview using a planar homography. The framework integrates gaze streams, egoview and centralview videos, and modular components (GlassesRecord, CentralCam, GlassesStream) bridged by time synchronization and Kafka-based streaming, with visualization and analysis tools for collective gaze dynamics. Validated in live events with 60 participants, the system shows precise synchronization (mean offsets ~20–45 ms) and robust gaze projection in dynamic scenes, enabling heatmap-based and temporal analyses of group attention. This work enhances ecological validity in eye-tracking and enables scalable data collection, real-time monitoring, and novel insights into social attention and collective behavior.
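The projection step mentioned above can be illustrated with a minimal sketch of how a planar homography maps one gaze point from an egoview into centralview coordinates. The function name `project_gaze` and the example matrix are hypothetical and are not taken from the SocialEyes code; in practice the matrix would be estimated from point correspondences between the two views.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def project_gaze(H: Matrix3, gaze_xy: Tuple[float, float]) -> Tuple[float, float]:
    """Map an egoview gaze point into the centralview with a 3x3 homography.

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied by H,
    and divided by the resulting scale factor w.
    """
    x, y = gaze_xy
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("gaze point maps to infinity under this homography")
    return (u / w, v / w)


# Hypothetical homography (assumption for illustration only):
# scale by 2 and translate by (100, 50) pixels.
H_example = [
    [2.0, 0.0, 100.0],
    [0.0, 2.0, 50.0],
    [0.0, 0.0, 1.0],
]

print(project_gaze(H_example, (10.0, 20.0)))  # (120.0, 90.0)
```

Because a homography is only exact for planar scenes (or pure camera rotation), a sketch like this would be expected to work best when gaze falls on a roughly flat target such as a stage backdrop or video wall.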

Abstract

Eye movements provide a window into human behaviour, attention, and interaction dynamics. Challenges in real-world, multi-person environments have, however, restrained eye-tracking research predominantly to single-person, in-lab settings. We developed a system to stream, record, and analyse synchronised data from multiple mobile eye-tracking devices during collective viewing experiences (e.g., concerts, films, lectures). We implemented lightweight operator interfaces for real-time-monitoring, remote-troubleshooting, and gaze-projection from individual egocentric perspectives to a common coordinate space for shared gaze analysis. We tested the system in a live concert and a film screening with 30 simultaneous viewers during each of two public events (N=60). We observe precise time-synchronisation between devices measured through recorded clock-offsets, and accurate gaze-projection in challenging dynamic scenes. Our novel analysis metrics and visualizations illustrate the potential of collective eye-tracking data for understanding collaborative behaviour and social interaction. This advancement promotes ecological validity in eye-tracking research and paves the way for innovative interactive tools.
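To make the notion of a recorded clock offset concrete, the sketch below shows one common way such an offset can be estimated from a single request/response exchange, in the style of NTP. This is an assumption for illustration: the abstract does not state how SocialEyes computes its offsets, and the function names and timestamps here are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def estimate_offset(t_send_ms: float, t_device_ms: float, t_recv_ms: float) -> Tuple[float, float]:
    """Return (offset_ms, roundtrip_ms) for one timing exchange.

    t_send_ms and t_recv_ms are read from the host clock; t_device_ms is the
    device clock reading returned in the reply. Assumes the network delay is
    symmetric, so the device was sampled halfway through the roundtrip.
    """
    roundtrip = t_recv_ms - t_send_ms
    offset = t_device_ms - (t_send_ms + roundtrip / 2)
    return offset, roundtrip


def summarise(samples: Iterable[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Mean offset and mean roundtrip over repeated exchanges with one device."""
    results = [estimate_offset(*s) for s in samples]
    return mean(r[0] for r in results), mean(r[1] for r in results)


# Hypothetical timestamps in milliseconds (not measured data).
samples = [(1000.0, 1030.0, 1020.0), (2000.0, 2034.0, 2016.0)]
print(summarise(samples))  # (23.0, 18.0)
```

Averaging over repeated exchanges reduces the influence of any single delayed packet, and tracking the estimate over a session would expose drift between device and host clocks.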

Paper Structure (44 sections, 13 figures, 2 tables)

Figures (13)

  • Figure 1: LEFT. Schematic of a shared scene depicting multiple people wearing mobile eye-tracking glasses gazing at a stage. A central camera at the back of the audience records the shared scene (centralview). MIDDLE. Egoviews of each glasses wearer need to be projected onto the centralview. Point correspondences between the egocentric and centralviews are used to calculate homography matrices that relate the two views. RIGHT. Using the computed homography matrices to remap the gaze coordinates, all individuals’ gaze points can be projected onto the shared scene (centralview).
  • Figure 2: Software framework demonstrating the flow (arrow lines) of data streams (ellipses) in the system. Solid and dashed lines represent the recording and streaming mode of operation respectively. Each mode of operation engages separate modules (dog-eared rectangles) that process the incoming data.
  • Figure 3: A. represents the stage setup during the concert part of the event. The film was presented on the video wall visible behind the performers. During the film, the stage was cleared and/or covered. B. shows audience chairs with eye-tracking glasses attached to each seat. The eye-tracking glasses were connected to companion smartphones that were secured in pockets attached on either side of each pair of seats.
  • Figure 4: A. Mean time offset (left) and mean roundtrip duration (right), in ms, recorded for the two sessions on each day. The coloured dots represent mean offset for each device over the respective session (x-axis); black dots represent mean offset across devices with the error bars representing standard error. B. Offset drift calculated for each device (x-axis) in each of the four sessions. The vertical bars represent standard deviation, horizontal bars represent the mean and individual dots represent single data points at measured timepoints.
  • Figure 5: A. An example of feature matching from one frame of one participant’s egoview (left) onto the centralview (right). The recorded gaze corresponding to this frame is represented by the maroon and white circle. B. Multi-person views generated with the visualisation module. Top: All participants’ egoviews and gazes (small white and black circles) are displayed around the perimeter. SocialEyes projects each participant's gaze onto the centralview (centre). In the centre panel, participants’ gazes are represented with uniquely coloured circles with white in the middle to aid visibility. Bottom: Same as top, with the centre grid cell displaying a heatmap of the 2D gaze density of all participants looking at the scene. Higher intensities in the heatmap (intensity increases from purple to yellow with yellow being the highest) represent a higher proportion of the participants looking at that location. C. Same as centre panels in B, but resized to aid visibility.
  • ...and 8 more figures
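
The heatmap described in the Figure 5 caption can be approximated by binning projected gaze points on a grid over the centralview. The sketch below is a simplified stand-in, not the visualisation module itself: the function name, the grid size, and the choice to normalise by the number of in-frame points are all assumptions.

```python
from typing import Iterable, List, Tuple


def gaze_heatmap(
    points: Iterable[Tuple[float, float]],
    width: float,
    height: float,
    bins_x: int,
    bins_y: int,
) -> List[List[float]]:
    """Bin centralview gaze points into a bins_y x bins_x grid.

    Each cell holds the proportion of in-frame gaze points that fall in it.
    Points outside the frame (e.g. failed projections) are ignored.
    """
    grid = [[0.0] * bins_x for _ in range(bins_y)]
    kept = 0
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            col = int(x * bins_x / width)
            row = int(y * bins_y / height)
            grid[row][col] += 1.0
            kept += 1
    if kept:
        grid = [[count / kept for count in row] for row in grid]
    return grid


# Hypothetical projected gaze points for four viewers on a 100x100 frame;
# the last point lies outside the frame and is dropped.
gaze_points = [(10.0, 10.0), (15.0, 12.0), (90.0, 80.0), (200.0, 50.0)]
heatmap = gaze_heatmap(gaze_points, width=100, height=100, bins_x=2, bins_y=2)
for row in heatmap:
    print(row)
```

With one gaze point per participant per frame, each cell value reads as the share of participants looking at that region, which matches the interpretation given in the caption. A smoother map would typically replace the hard bins with a Gaussian kernel.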