EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset
Ronan John, Aditya Kesari, Vincenzo DiMatteo, Kristin Dana
TL;DR
This work introduces EgoCampus, a large outdoor egocentric gaze dataset collected with Project Aria glasses, and EgoCampusNet (ECN), a spatio-temporal fusion model that predicts pedestrian gaze heatmaps from egocentric video. Using synchronized RGB video, eye gaze, IMU, GPS, and other sensor data from 82 participants across 25 campus paths, the authors demonstrate strong gaze-prediction performance with a model built on temporal video features and a query-frame encoder. ECN achieves state-of-the-art results across standard gaze-saliency metrics (AUC-Judd, CC, KLD, SIM, NSS) and outperforms pretrained baselines that were not trained on EgoCampus, underscoring the value of environment-aware, navigation-driven gaze modeling. The dataset and model are intended to support improved navigation and human-robot interaction in real-world settings, and additional resources such as the YOPO-Campus robot-view dataset support multimodal navigation research.
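For readers unfamiliar with the saliency metrics named above, the following is a minimal NumPy sketch of four of them (CC, SIM, KLD, NSS), following their commonly used definitions in saliency benchmarking. It is not the authors' evaluation code: the function names, the `EPS` constant, and the min-shift normalization are assumptions of this sketch, and AUC-Judd is omitted for brevity.

```python
import numpy as np

EPS = 1e-8  # assumed small constant to avoid division by zero and log(0)


def _as_distribution(heatmap):
    """Shift a map to be non-negative and scale it to sum to (about) 1."""
    h = np.asarray(heatmap, dtype=np.float64)
    h = h - h.min()
    return h / (h.sum() + EPS)


def _zscore(heatmap):
    """Standardize a map to zero mean and (about) unit variance."""
    h = np.asarray(heatmap, dtype=np.float64)
    return (h - h.mean()) / (h.std() + EPS)


def cc(pred, gt):
    """Pearson linear correlation between two heatmaps (higher is better)."""
    return float(np.mean(_zscore(pred) * _zscore(gt)))


def sim(pred, gt):
    """Histogram intersection of the two maps as distributions (higher is better)."""
    return float(np.minimum(_as_distribution(pred), _as_distribution(gt)).sum())


def kld(pred, gt):
    """KL divergence of the prediction from the ground truth (lower is better)."""
    p = _as_distribution(pred)
    g = _as_distribution(gt)
    return float(np.sum(g * np.log(EPS + g / (p + EPS))))


def nss(pred, fixations):
    """Mean z-scored prediction at fixated pixels; `fixations` is a binary mask."""
    mask = np.asarray(fixations).astype(bool)
    return float(_zscore(pred)[mask].mean())
```

A perfect prediction gives CC and SIM close to 1 and KLD close to 0. NSS is computed against discrete fixation locations rather than a smoothed ground-truth heatmap, so it takes a binary mask as its second argument.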
Abstract
We address the challenge of predicting human visual attention during real-world navigation by measuring and modeling egocentric pedestrian eye gaze in an outdoor campus setting. We introduce the EgoCampus dataset, which spans 25 unique outdoor paths over 6 km across a university campus with recordings from more than 80 distinct human pedestrians, resulting in a diverse set of gaze-annotated videos. The system used for collection, Meta's Project Aria glasses, integrates eye tracking, front-facing RGB cameras, inertial sensors, and GPS to provide rich data from the human perspective. Unlike many prior egocentric datasets that focus on indoor tasks or exclude eye gaze information, our work emphasizes visual attention while subjects walk along outdoor campus paths. Using this data, we develop EgoCampusNet, a novel method to predict the eye gaze of pedestrians as they navigate outdoor environments. Our contributions provide both a new resource for studying real-world attention and a foundation for future work on gaze prediction models for navigation. Dataset and code are available upon request and will be made publicly available at https://github.com/ComputerVisionRutgers/EgoCampus at a later date.
