Bridge then Begin Anew: Generating Target-relevant Intermediate Model for Source-free Visual Emotion Adaptation
Jiankun Zhu, Sicheng Zhao, Jing Jiang, Wenbo Tang, Zhaopan Xu, Tingting Han, Pengfei Xu, Hongxun Yao
TL;DR
This work tackles source-free domain adaptation for visual emotion recognition (SFDA-VER), addressing privacy constraints by not accessing source data during adaptation. It introduces Bridge then Begin Anew (BBA), a two-stage framework: Domain-bridged Model Generation (DMG) generates a bridge model to yield reliable pseudo-labels, and Target-related Model Adaptation (TMA) trains a target model from scratch under guidance from the bridge, augmented with masking, clustering, and emotion polarity losses. The approach yields substantial improvements over state-of-the-art SFDA methods and competes with or exceeds several unsupervised domain adaptation baselines across six VER settings, demonstrating robust cross-domain transfer under large affective gaps. Overall, BBA enables privacy-preserving VER deployment by reducing dependence on source data while maintaining strong performance and encouraging target-domain feature discovery.
Abstract
Visual emotion recognition (VER), which aims to understand humans' emotional reactions to different visual stimuli, has attracted increasing attention. Because emotion is subjective and ambiguous, annotating a reliable large-scale dataset is difficult. To reduce reliance on data labeling, domain adaptation offers an alternative by adapting models trained on labeled source data to unlabeled target data. Conventional domain adaptation methods require access to source data; however, due to privacy concerns, source emotional data may be inaccessible. To address this issue, we introduce a previously unexplored task: source-free domain adaptation (SFDA) for VER, in which source data are not accessible during adaptation. To this end, we propose a novel framework termed Bridge then Begin Anew (BBA), which consists of two steps: domain-bridged model generation (DMG) and target-related model adaptation (TMA). First, DMG bridges cross-domain gaps by generating an intermediate model, avoiding direct alignment between two VER datasets with significant differences. Then, TMA trains the target model anew to fit the target structure, avoiding the influence of source-specific knowledge. Extensive experiments are conducted on six SFDA settings for VER. The results demonstrate the effectiveness of BBA, which achieves remarkable performance gains compared with state-of-the-art SFDA methods and outperforms representative unsupervised domain adaptation approaches.
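
As a rough illustration of how an intermediate model could supply supervision for a target model trained from scratch, the sketch below filters pseudo-labels by prediction confidence. This is not the authors' procedure: the abstract does not say how pseudo-labels are selected, so the confidence threshold, the function name `select_pseudo_labels`, and the toy probabilities are all assumptions made for illustration.

```python
def select_pseudo_labels(bridge_probs, threshold=0.8):
    """Keep target samples whose top probability meets the threshold.

    bridge_probs: list of per-sample class-probability lists, standing in
    for hypothetical bridge-model outputs on unlabeled target images.
    threshold: assumed confidence cutoff (not taken from the paper).
    Returns a list of (sample_index, pseudo_label) pairs.
    """
    selected = []
    for idx, probs in enumerate(bridge_probs):
        # Index of the most probable emotion class for this sample.
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((idx, label))
    return selected


# Toy probabilities for three target images over three emotion classes.
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
]
pseudo = select_pseudo_labels(toy_probs, threshold=0.8)
print(pseudo)  # [(0, 0), (2, 1)]
```

In a two-stage setup of this kind, the retained pairs would serve as training targets for a freshly initialized target model, while low-confidence samples such as the second one would be left to other unsupervised objectives.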
