Improving Domain Generalization on Gaze Estimation via Branch-out Auxiliary Regularization

Ruijie Zhao; Pinyan Tang; Sihui Luo

TL;DR

This work tackles domain generalization in appearance-based gaze estimation, where variations in illumination and identity degrade performance under uncontrolled conditions. It introduces Branch-out Auxiliary Regularization (BAR), a plug-and-play, training-time framework that adds two auxiliary branches (an augmentation branch and a contrast branch) to enforce consistency and disentangle gaze-relevant features from gaze-irrelevant attributes, without using target-domain data. BAR attaches to an existing model through a multi-term loss, $\mathcal{L}_{total} = \mathcal{L}_{ori} + \lambda_{a}\mathcal{L}_{aug} + \lambda_{m}\mathcal{L}_{mmd} + \lambda_{c}\mathcal{L}_{con}$, with all $\lambda$ set to 1.0; the three auxiliary terms $\mathcal{L}_{aug}$, $\mathcal{L}_{mmd}$, and $\mathcal{L}_{con}$ promote invariance to environmental and identity factors. Experiments on four cross-dataset tasks show that BAR consistently surpasses the baselines and state-of-the-art methods, and its plug-and-play design makes it easy to adopt across diverse gaze-estimation architectures.
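As a rough illustration of how the four terms in $\mathcal{L}_{total}$ combine, the sketch below sums already-computed loss values with the weights stated above (all 1.0). It also includes a generic, biased estimate of squared maximum mean discrepancy with a Gaussian kernel, on the assumption that $\mathcal{L}_{mmd}$ refers to an MMD-style term; which features the paper compares, the kernel, and the bandwidth `sigma` are not given in this summary, so those choices and all function names here are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def gaussian_kernel(x: Vector, y: Vector, sigma: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector], sigma: float = 1.0) -> float:
    """Biased estimate of squared MMD between two sets of feature vectors."""
    def mean_kernel(a: Sequence[Vector], b: Sequence[Vector]) -> float:
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def total_loss(
    l_ori: float,
    l_aug: float,
    l_mmd: float,
    l_con: float,
    lambda_a: float = 1.0,
    lambda_m: float = 1.0,
    lambda_c: float = 1.0,
) -> float:
    """L_total = L_ori + lambda_a * L_aug + lambda_m * L_mmd + lambda_c * L_con."""
    return l_ori + lambda_a * l_aug + lambda_m * l_mmd + lambda_c * l_con


# Hypothetical feature sets standing in for two branches' outputs.
original_feats = [[0.0, 0.0], [1.0, 0.0]]
augmented_feats = [[0.5, 0.5], [1.5, 0.5]]
l_mmd_example = mmd_squared(original_feats, augmented_feats)
loss_example = total_loss(l_ori=0.20, l_aug=0.05, l_mmd=l_mmd_example, l_con=0.10)
```

Because the auxiliary terms only enter the training objective, dropping them at inference leaves the original network's forward pass unchanged, which is consistent with the plug-and-play description.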

Abstract

Despite remarkable advancements, mainstream gaze estimation techniques, particularly appearance-based methods, often suffer from performance degradation in uncontrolled environments due to variations in illumination and individual facial attributes. Existing domain adaptation strategies, limited by their need for target domain samples, may fall short in real-world applications. This letter introduces Branch-out Auxiliary Regularization (BAR), an innovative method designed to boost gaze estimation's generalization capabilities without requiring direct access to target domain data. Specifically, BAR integrates two auxiliary consistency regularization branches: one that uses augmented samples to counteract environmental variations, and another that aligns gaze directions with positive source domain samples to encourage the learning of consistent gaze features. These auxiliary pathways strengthen the core network and are integrated in a smooth, plug-and-play manner, facilitating easy adaptation to various other models. Comprehensive experimental evaluations on four cross-dataset tasks demonstrate the superiority of our approach.

Paper Structure (18 sections, 10 equations, 3 figures, 3 tables)

Introduction
Domain Generalization for Gaze Estimation
Preliminaries
Original Branch
Augmentation Branch
Contrast Branch
Overall Loss Function
Experiment
Data Preparation and Evaluation Metric
Implementation Details
Comparison Methods
Baseline
State-of-the-art methods
Ablation Studies
State-of-the-Art Comparison
...and 3 more sections

Figures (3)

Figure 1: Comparison of our method with conventional UDA and DG methods for gaze estimation. UDA methods generally rely on target domain data to enable knowledge transfer from the source domain to the target domain. Conventional DG methods leverage uncontrolled adversarial learning techniques that may cause feature elimination problems. Our method, in contrast, requires no access to target domain data and leverages flexible auxiliary consistency regularization branches to enhance model generalization.
Figure 2: The overall framework of our approach. We extend the original gaze estimation network by integrating two auxiliary consistency regularization pathways in a plug-and-play manner. Notably, the auxiliary branches are exclusively employed during the training phase and do not influence the test phase.
Figure 3: t-SNE visualization of the extracted features. Similar colors represent similar gaze directions. Each dot denotes a sample from the test set.
