Test-Time Personalization with Meta Prompt for Gaze Estimation

Huan Liu, Julia Qi, Zhenhao Li, Mohammad Hassanpour, Yang Wang, Konstantinos Plataniotis, Yuanhao Yu

TL;DR

This work takes inspiration from recent advances in Natural Language Processing: it updates only a negligible number of parameters, "prompts", at test time, and meta-learns the prompt to ensure that its updates align with the goal of minimizing gaze estimation error.

Abstract

Despite recent remarkable achievements in gaze estimation, efficient and accurate personalization of gaze estimation without labels is a practical problem that is rarely addressed in the literature. To achieve efficient personalization, we take inspiration from recent advances in Natural Language Processing (NLP) by updating a negligible number of parameters, "prompts", at test time. Specifically, the prompt is attached as an addition that does not perturb the original network, and it can contain less than 1% of a ResNet-18's parameters. Our experiments show the high efficiency of the prompt tuning approach: the proposed method can be 10 times faster in adaptation speed than the compared methods. However, it is non-trivial to update the prompt for personalized gaze estimation without labels. At test time, it is essential to ensure that minimizing a particular unsupervised loss leads to the goal of minimizing gaze estimation error. To address this difficulty, we propose to meta-learn the prompt to ensure that its updates align with that goal. Our experiments show that the meta-learned prompt can be effectively adapted even with a simple symmetry loss. In addition, we run four cross-dataset validations to show the remarkable advantages of the proposed method. Code is available at https://github.com/hmarkamcan/TPGaze.
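
To make the test-time step concrete, here is a minimal toy sketch of adapting only a prompt with a label-free symmetry loss. It is not the authors' implementation: the linear stand-in model, the names `FROZEN_W`, `FROZEN_V`, `predict_yaw`, `symmetry_loss` and `personalize`, and the specific loss form (the yaw of an image and the yaw of its horizontal mirror should cancel) are all assumptions made for illustration.

```python
# Toy sketch only: a linear stand-in for a gaze network, NOT the paper's code.
# All names and the exact loss form are illustrative assumptions.

FROZEN_W = [0.5, -0.2, 0.1, 0.4]  # frozen "backbone" weights, never updated
FROZEN_V = [0.3, -0.6]            # frozen weights applied to the prompt


def predict_yaw(pixels, prompt):
    """Yaw = w . pixels + v . prompt (the prompt is the only tunable part)."""
    backbone = sum(w * x for w, x in zip(FROZEN_W, pixels))
    return backbone + sum(v * p for v, p in zip(FROZEN_V, prompt))


def symmetry_residual(pixels, prompt):
    """Yaw of an image plus yaw of its mirror; zero when perfectly symmetric."""
    return predict_yaw(pixels, prompt) + predict_yaw(pixels[::-1], prompt)


def symmetry_loss(pixels, prompt):
    """Squared symmetry residual; needs no gaze labels."""
    return symmetry_residual(pixels, prompt) ** 2


def mean_loss(images, prompt):
    return sum(symmetry_loss(x, prompt) for x in images) / len(images)


def personalize(images, prompt, lr=0.1, steps=50):
    """Gradient descent on the prompt only; frozen weights are left untouched."""
    prompt = list(prompt)
    for _ in range(steps):
        grad = [0.0] * len(prompt)
        for pixels in images:
            s = symmetry_residual(pixels, prompt)
            for k, v in enumerate(FROZEN_V):
                # d(s^2)/dp_k = 2*s * ds/dp_k, with ds/dp_k = 2*v_k because
                # the original and mirrored views share the same prompt.
                grad[k] += 2.0 * s * 2.0 * v / len(images)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


# Usage: unlabeled images of one (hypothetical) subject, prompt starts at zero.
images = [[0.9, 0.1, 0.3, 0.7], [0.2, 0.8, 0.5, 0.4]]
tuned_prompt = personalize(images, [0.0, 0.0])
print(mean_loss(images, [0.0, 0.0]), mean_loss(images, tuned_prompt))
```

The toy shows only the mechanics: the loss is computed without labels and only the prompt changes. It does not show why a lower symmetry loss should mean a lower gaze error, which is the gap the abstract says meta-learning the prompt is meant to close.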

Paper Structure (29 sections, 8 equations, 6 figures, 5 tables, 1 algorithm)

Figures (6)

  • Figure 1: Illustration of performance comparison in terms of gaze error and tunable parameters. Our method achieved the lowest gaze error with negligible tunable parameters required for personalization. Gaze errors are calculated by averaging four cross-dataset validations.
  • Figure 2: Illustration of the differences among four representative datasets. They differ in subject appearance, image quality and lighting conditions.
  • Figure 3: Illustration of replacing padding by tunable prompt.
  • Figure 4: An overview of the proposed test-time personalization on $j$-th subject. In personalization, all the parameters are fixed except for the prompt.
  • Figure 5: Distribution of gaze estimation results and ground-truth labels before and after personalization. Results are shown for personalization from $\mathcal{D}_G$ to $\mathcal{D}_D$.
  • ...and 1 more figures
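
The idea named in the Figure 3 caption, replacing padding with a tunable prompt, can be sketched roughly as follows. This is a hedged illustration and not the paper's implementation: the single-channel map, the 1-pixel border, the row-major ordering of border values, and the function names `border_size` and `pad_with_prompt` are all assumptions.

```python
# Illustrative sketch: fill a 1-pixel padding border with prompt values
# instead of zeros. Layout and names are assumptions, not the paper's code.

def border_size(height, width):
    """Number of cells in a 1-pixel border around a height x width map."""
    return (height + 2) * (width + 2) - height * width


def pad_with_prompt(feature, prompt):
    """Return the padded map; border cells take `prompt` values in row-major order."""
    height, width = len(feature), len(feature[0])
    if len(prompt) != border_size(height, width):
        raise ValueError("prompt length must equal the number of border cells")
    values = iter(prompt)
    padded = []
    for r in range(height + 2):
        row = []
        for c in range(width + 2):
            on_border = r in (0, height + 1) or c in (0, width + 1)
            # Interior cells copy the frozen feature map; border cells are tunable.
            row.append(next(values) if on_border else feature[r - 1][c - 1])
        padded.append(row)
    return padded


# Usage: an all-zero prompt reproduces ordinary zero padding.
feature = [[1.0, 2.0], [3.0, 4.0]]
zero_padded = pad_with_prompt(feature, [0.0] * border_size(2, 2))
```

Under these assumptions the interior of the map and the convolution weights stay untouched, and the tunable parameters are confined to the border, which is consistent with the captions' description of a prompt that needs negligible tunable parameters.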