Few-shot Personalized Scanpath Prediction
Ruoyu Xue, Jingyi Xu, Sounak Mondal, Hieu Le, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
TL;DR
This work tackles few-shot personalized scanpath prediction by decoupling subject representation learning from the scanpath predictor. It introduces SE-Net to extract robust subject embeddings from limited gaze data and uses ISP-SENet, a subject-conditioned predictor, to generate personalized scanpaths without test-time fine-tuning. Training SE-Net with triplet and contrastive losses on a base dataset enables effective generalization to unseen subjects given a small support set, achieving strong performance on OSIE, COCO-FreeView, and COCO-Search18 in 1-, 5-, and 10-shot scenarios. The approach adapts rapidly (on the order of a few seconds), improves accuracy, and offers interpretability by revealing which fixations drive subject differentiation, supporting practical deployment in eye-tracking applications.
Abstract
A personalized model for scanpath prediction provides insights into the visual preferences and attention patterns of individual subjects. However, existing methods for training scanpath prediction models are data-intensive and cannot be effectively personalized to new individuals with only a few available examples. In this paper, we propose the few-shot personalized scanpath prediction (FS-PSP) task, which aims to predict scanpaths for an unseen subject using minimal support data of that subject's scanpath behavior, together with a novel method to address it. The key to our method's adaptability is the Subject-Embedding Network (SE-Net), specifically designed to capture unique, individualized representations for each subject's scanpaths. SE-Net generates subject embeddings that effectively distinguish between subjects while minimizing variability among scanpaths from the same individual. The personalized scanpath prediction model is then conditioned on these subject embeddings to produce accurate, personalized results. Experiments on multiple eye-tracking datasets demonstrate that our method excels in FS-PSP settings and does not require any fine-tuning steps at test time. Code is available at: https://github.com/cvlab-stonybrook/few-shot-scanpath
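
To make the embedding idea concrete, the following is a minimal sketch, not the authors' implementation. It illustrates a standard triplet loss, which pulls two scanpath embeddings from the same subject together and pushes an embedding from a different subject away, and one plausible way to form a single subject embedding from a few support scanpaths. The function names, the margin value, and the mean aggregation over the support set are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on scanpath embeddings.

    anchor and positive come from the same subject, negative from a
    different subject. The margin of 0.2 is an assumed value.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def subject_embedding(support_embeddings):
    """Aggregate k support scanpath embeddings into one subject embedding.

    Averaging followed by L2 normalization is an assumption for this
    sketch; the actual aggregation may differ.
    """
    k = len(support_embeddings)
    dim = len(support_embeddings[0])
    mean = [sum(e[i] for e in support_embeddings) / k for i in range(dim)]
    return l2_normalize(mean)


# Hypothetical 2-D embeddings for a 2-shot support set of one unseen subject.
support = [[2.0, 0.0], [0.0, 2.0]]
subject_vec = subject_embedding(support)
loss = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

In this sketch, `subject_vec` is what a subject-conditioned predictor would receive as its conditioning input, so adapting to a new subject requires only a forward pass over the support scanpaths rather than any gradient update.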
