Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement

Haodong Li, Hao Lu, Ying-Cong Chen

TL;DR

This work pioneers Test-Time Adaptation (TTA) in rPPG: pre-trained models are adapted to the target domain during inference, without annotations and without source data, which is often unavailable due to privacy considerations. It also establishes a large-scale benchmark for rPPG tasks under the TTA protocol.

Abstract

Remote photoplethysmography (rPPG) is gaining prominence for its non-invasive approach to monitoring physiological signals using only cameras. Despite its promise, the adaptability of rPPG models to new, unseen domains is hindered by the environmental sensitivity of physiological signals. To address this, we pioneer Test-Time Adaptation (TTA) in rPPG, enabling the adaptation of pre-trained models to the target domain during inference and sidestepping the need for annotations or for source data, which is often unavailable due to privacy considerations. In particular, using only the user's face video stream as the accessible target-domain data, the rPPG model is adjusted by tuning on each single instance it encounters. However, 1) TTA algorithms are designed predominantly for classification tasks and are ill-suited to regression tasks such as rPPG because they provide inadequate supervision; and 2) tuning pre-trained models in a single-instance manner introduces variability and instability, making it hard to separate domain-relevant from domain-irrelevant features while preserving the learned information. To overcome these challenges, we present Bi-TTA, a novel expert knowledge-based Bidirectional Test-Time Adapter framework. Specifically, leveraging two expert-knowledge priors to provide self-supervision, Bi-TTA comprises two main modules: a prospective adaptation (PA) module that uses sharpness-aware minimization to eliminate domain-irrelevant noise, enhancing the stability and efficacy of the adaptation process, and a retrospective stabilization (RS) module that dynamically reinforces crucial learned model parameters, averting performance degradation caused by overfitting or catastrophic forgetting. In addition, we establish a large-scale benchmark for rPPG tasks under the TTA protocol. Experimental results demonstrate the significant superiority of our approach over the state of the art.
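
To make the sharpness-aware minimization idea behind the PA module concrete, here is a minimal sketch of one generic SAM update on a toy problem. It is not the authors' implementation: the quadratic `loss`, its analytic `grad`, and the names `sam_step`, `rho`, and `lr` are assumptions chosen for illustration, with `rho` standing in for the neighborhood radius and `lr` for the learning rate. The step first moves to the worst-case neighbor within radius `rho` along the normalized gradient, then updates the original parameters with the gradient measured there.

```python
import math


def loss(w):
    # Toy quadratic standing in for a self-supervised objective (hypothetical).
    return sum((wi - 1.0) ** 2 for wi in w)


def grad(w):
    # Analytic gradient of the toy loss above.
    return [2.0 * (wi - 1.0) for wi in w]


def sam_step(w, rho=0.05, lr=0.1):
    """One generic sharpness-aware minimization step on a parameter list."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Already at a stationary point: nothing to perturb or update.
        return list(w)
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient from the perturbed point to the
    # original parameters.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w0 = [3.0, -2.0]
w = list(w0)
for _ in range(50):
    w = sam_step(w)
```

On this toy loss the iterates settle within a small neighborhood of the minimizer rather than exactly on it, because the update always uses the gradient at the perturbed point; a smaller `rho` tightens that neighborhood.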

Paper Structure (15 sections, 9 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Visualization of the mean absolute error (MAE) on the down-sampled VIPL dataset using (a) the pre-trained rPPG model; (b) the priors-based test-time adapted model; and (c) the priors-based bidirectionally test-time adapted model. These visualizations show that the proposed priors and the Bi-TTA framework significantly enhance generalization performance beyond the pre-trained model, as reflected by the flatness of the MAE field (Li et al., 2018).
  • Figure 2: Visualization of Domain Adaptation (DA) and Test-Time Adaptation (TTA) methodologies. DA utilizes batch learning with labeled target data, while TTA dynamically refines the model during inference without relying on target-data labels or the target distribution. Both DA and TTA obviate the requirement for source data. Note that domain generalization (DG) is excluded because it does not focus on adaptation to a specific target domain.
  • Figure 3: Illustration of STMap construction and the implementation of the proposed expert knowledge-based priors. (a) The process of generating the STMap, encompassing face alignment and cropping, local signal extraction, and the subsequent integration. (b) The calculation of TCL, which aims to minimize significant discrepancies between the original and temporally shifted HR predictions. (c) The calculation of SCL, which penalizes pronounced disparities across different facial regions. Note that the boxes colored in pink represent the loss outcomes.
  • Figure 4: Illustration of the proposed Bidirectional Test-Time Adapter (Bi-TTA). Black arrows $\rightarrow$ indicate the adaptation process using only the two proposed priors, i.e., TCL and SCL. Orange arrows $\rightarrow$ denote that the PA module adjusts the model parameters using the gradient of a representative neighborhood with radius $\rho$. Green arrows $\rightarrow$ show that the RS module is activated when an oscillation occurs, which is a sign of performance degradation, to preserve the essential learned adaptation ability using earlier tuning gradients. The gradient and learning rate are denoted $\boldsymbol{g}$ and $\eta$, respectively.
  • Figure 5: Ablation experiments on hyper-parameters on the down-sampled VIPL dataset. Starting from the default parameter configuration detailed in the implementation section, only one hyper-parameter is varied in each set of experiments. The pink arrows point to the lowest MAE result, and the blue arrows mark the second lowest where the two are hard to distinguish.
  • ...and 1 more figures
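
The two priors described in the Figure 3 caption can be sketched as simple consistency penalties on heart-rate (HR) predictions. The exact formulations are not given here, so the code below is an illustrative assumption rather than the paper's definition: `temporal_consistency_loss` uses the mean squared difference between the HR predicted on the original clip and on temporally shifted clips, and `spatial_consistency_loss` uses the variance of per-region HR predictions. All function and argument names are hypothetical.

```python
def temporal_consistency_loss(hr_original, hr_shifted):
    """Mean squared gap between the original HR prediction and the
    predictions on temporally shifted clips (assumed form of a TCL-style prior)."""
    return sum((h - hr_original) ** 2 for h in hr_shifted) / len(hr_shifted)


def spatial_consistency_loss(hr_per_region):
    """Variance of HR predictions across facial regions
    (assumed form of an SCL-style prior)."""
    mean_hr = sum(hr_per_region) / len(hr_per_region)
    return sum((h - mean_hr) ** 2 for h in hr_per_region) / len(hr_per_region)


# Hypothetical HR predictions in beats per minute.
tcl = temporal_consistency_loss(72.0, [70.0, 74.0])
scl = spatial_consistency_loss([70.0, 74.0])
```

Both penalties are zero when the predictions agree, so minimizing them needs no ground-truth HR labels, which is what makes priors of this kind usable as self-supervision at test time.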