Neural Radiance and Gaze Fields for Visual Attention Modeling in 3D Environments
Andrei Chubarau, Yinan Wang, James J. Clark
TL;DR
Neural Radiance and Gaze Fields (NeRGs) address the challenge of visualizing gaze in 3D environments by augmenting a fixed NeRF with a lightweight gaze module, rendering both scene appearance and a gaze density on 3D surfaces. The system decouples the observer viewpoint from the rendering viewpoint and explicitly handles gaze occlusion via depth-based tests. Training uses head-pose data as a proxy for gaze and aggregates the resulting rays into gaze probes, which supervise a NeRF-based gaze predictor and yield 3D gaze visualization at interactive framerates in real-world scenes. This geometry-aware approach enables flexible exploration of attention in complex 3D environments, with potential applications in signage design, VR/AR interfaces, and shopper behavior analysis.
Abstract
We introduce Neural Radiance and Gaze Fields (NeRGs), a novel approach for representing visual attention in complex environments. Much like how Neural Radiance Fields (NeRFs) perform novel view synthesis, NeRGs reconstruct gaze patterns from arbitrary viewpoints, implicitly mapping visual attention to 3D surfaces. We achieve this by augmenting a standard NeRF with an additional network that models local egocentric gaze probability density, conditioned on scene geometry and observer position. The output of a NeRG is a rendered view of the scene alongside a pixel-wise salience map representing the conditional probability that a given observer fixates on visible surfaces. Unlike prior methods, our system is lightweight and enables visualization of gaze fields at interactive framerates. Moreover, NeRGs allow the observer perspective to be decoupled from the rendering camera and correctly account for gaze occlusion due to intervening geometry. We demonstrate the effectiveness of NeRGs using head pose from skeleton tracking as a proxy for gaze, employing our proposed gaze probes to aggregate noisy rays into robust probability density targets for supervision.
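To illustrate the gaze probe idea, the following minimal Python sketch bins noisy gaze rays gathered at one probe location into a normalized histogram over azimuth and elevation. It is not the authors' implementation. The function names `direction_to_bin` and `gaze_probe_density`, the 16 by 8 angular grid, and the equal-angle binning are all assumptions made for this example. Note that equal-angle bins hold probability mass rather than density per unit solid angle, so a faithful density target would also need a per-bin area correction.

```python
import math


def direction_to_bin(direction, n_az, n_el):
    """Map a 3D gaze direction to (azimuth, elevation) bin indices.

    Hypothetical helper: the angular parameterization is an assumption.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-length gaze direction")
    x, y, z = x / norm, y / norm, z / norm
    azimuth = math.atan2(y, x)                      # in [-pi, pi]
    elevation = math.asin(max(-1.0, min(1.0, z)))   # in [-pi/2, pi/2]
    # Scale each angle to [0, n) and clamp the upper edge into the last bin.
    i = min(int((azimuth + math.pi) / (2.0 * math.pi) * n_az), n_az - 1)
    j = min(int((elevation + math.pi / 2.0) / math.pi * n_el), n_el - 1)
    return i, j


def gaze_probe_density(directions, n_az=16, n_el=8):
    """Aggregate noisy gaze rays at one probe into a normalized 2D histogram.

    Returns an n_az x n_el nested list whose entries sum to 1
    (or all zeros if no rays were supplied).
    """
    counts = [[0.0] * n_el for _ in range(n_az)]
    for direction in directions:
        i, j = direction_to_bin(direction, n_az, n_el)
        counts[i][j] += 1.0
    total = sum(sum(row) for row in counts)
    if total == 0.0:
        return counts
    return [[c / total for c in row] for row in counts]


if __name__ == "__main__":
    # Three noisy head-pose rays: two roughly forward, one roughly backward.
    rays = [(1.0, 0.1, 0.1), (1.0, 0.12, 0.09), (-1.0, 0.1, -0.1)]
    probe = gaze_probe_density(rays)
    occupied = [(i, j, p) for i, row in enumerate(probe)
                for j, p in enumerate(row) if p > 0.0]
    print(occupied)
```

Averaging many rays per probe in this way is one simple route to a smoother supervision target than any individual head-pose ray. A kernel-based estimate over the sphere would be a natural alternative under the same assumptions.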
