Webcam-based Pupil Diameter Prediction Benefits from Upscaling
Vijul Shah, Brian B. Moser, Ko Watanabe, Andreas Dengel
TL;DR
Low-resolution webcam eye images hinder precise pupil diameter estimation for cognitive and physiological state assessment. The study applies five pre-trained super-resolution (SR) models as a preprocessing step on full-face images to produce the EyeDentify++ left-eye and right-eye datasets, then trains three ResNet regressors on the images upscaled at 2× and 4×. The findings indicate that SR generally improves prediction accuracy, with strong interactions between SR method and scale. Bicubic upsampling often performs well, but several advanced SR models yield further gains and shift model attention, as observed in activation maps. These results offer practical guidance for selecting upscaling techniques to improve webcam-based pupillometry, enabling more reliable assessment of stress, cognitive load, and related states in real-world settings.
Abstract
Capturing pupil diameter is essential for assessing psychological and physiological states such as stress levels and cognitive load. However, the low resolution of images in eye datasets often hampers precise measurement. This study evaluates the impact of various upscaling methods, ranging from bicubic interpolation to advanced super-resolution, on pupil diameter prediction. We compare several pre-trained methods, including CodeFormer, GFPGAN, Real-ESRGAN, HAT, and SRResNet. Our findings suggest that pupil diameter prediction models trained on upscaled datasets are highly sensitive to the selected upscaling method and scale. Our results also demonstrate that upscaling consistently enhances the accuracy of these models, highlighting its importance in pupillometry. Overall, our work provides valuable insights for selecting upscaling techniques, paving the way for more accurate assessments in psychological and physiological research.
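
To make the bicubic baseline concrete, the following is a minimal, dependency-free sketch of bicubic upscaling at integer factors such as 2× and 4×. It is not the authors' pipeline: the function names `keys_kernel` and `upscale_bicubic` are hypothetical, the kernel parameter `a = -0.5` is an assumed (though common) choice, and the 4×4 "eye crop" is made-up toy data. In practice an image library would normally be used instead.

```python
import math


def keys_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5 is an assumption)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def upscale_bicubic(img, scale):
    """Upscale a 2D grayscale image (list of rows, values 0..255) by an integer factor."""
    h, w = len(img), len(img[0])
    out = []
    for oy in range(h * scale):
        # Map the output pixel centre back into input coordinates.
        sy = (oy + 0.5) / scale - 0.5
        y0 = math.floor(sy)
        row = []
        for ox in range(w * scale):
            sx = (ox + 0.5) / scale - 0.5
            x0 = math.floor(sx)
            acc = 0.0
            # Weighted sum over the 4x4 neighbourhood around (sy, sx).
            for j in range(-1, 3):
                yy = min(max(y0 + j, 0), h - 1)  # clamp rows at the border
                wy = keys_kernel(sy - (y0 + j))
                for i in range(-1, 3):
                    xx = min(max(x0 + i, 0), w - 1)  # clamp columns at the border
                    acc += img[yy][xx] * wy * keys_kernel(sx - (x0 + i))
            # Cubic kernels can overshoot, so clip to the valid intensity range.
            row.append(min(max(acc, 0.0), 255.0))
        out.append(row)
    return out


# Toy example: a bright 4x4 crop with a dark 2x2 "pupil" in the centre.
eye_crop = [
    [200, 200, 200, 200],
    [200, 30, 30, 200],
    [200, 30, 30, 200],
    [200, 200, 200, 200],
]
eye_2x = upscale_bicubic(eye_crop, 2)  # 8x8 image
eye_4x = upscale_bicubic(eye_crop, 4)  # 16x16 image
```

The learned SR models compared in the paper replace this fixed interpolation kernel with a trained network, but they serve the same role as a preprocessing step before the regressor sees the image.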
