Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
Junha Song, Tae Soo Kim, Junha Kim, Gunhee Nam, Thijs Kooi, Jaegul Choo
TL;DR
The paper tackles semi-supervised domain adaptation in deployed environments where user feedback provides a small labeled target set but is biased toward misclassified samples (Negatively Biased Feedback, NBF). It reveals that NBF can degrade adaptation when plugged into existing SemiSDA methods and introduces Retrieval Latent Defending (RLD), a plug-in strategy that balances the supervised signal by appending defending samples retrieved from a latent bank of pseudo-labeled data. RLD constructs a candidate bank by labeling unlabeled target data with top-probability pseudo-labels and selects defending samples per labeled instance to maintain balanced class discriminability, introducing the overall loss $\mathcal{L}_{total} = \mathcal{L}_{sup} + \mathcal{L}_{unsup} + \frac{1}{k \cdot B} \sum_{b=1}^{k \cdot B} \mathcal{H}(\hat{y}_{LD}^{b}, f_{\theta}(x_{LD}^{b}))$. Across natural image benchmarks (DomainNet-126, OfficeHome) and a real-world medical imaging task (MIMIC-CXR-V2), integrating RLD with multiple SemiSDA/SemiSL baselines yields consistent improvements, demonstrating a scalable, practical approach to robust adaptation under user feedback biases.
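The defending term in the loss above can be illustrated with a small sketch. This is not the authors' implementation. It assumes that $\mathcal{H}$ is the cross-entropy, that $B$ is the number of labeled feedback samples in a batch and $k$ the number of defending samples retrieved per feedback sample, and that retrieval draws at random from bank entries whose pseudo-label matches the feedback sample's class. The summary does not spell out the retrieval rule, so that part is a guess, and all function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(label, probs):
    """H(y, p) for a hard label y (assumed form of the H term)."""
    return -math.log(max(probs[label], 1e-12))


def build_candidate_bank(unlabeled, predict):
    """Group unlabeled target samples by their top-probability pseudo-label."""
    bank = {}
    for x in unlabeled:
        probs = predict(x)
        pseudo = max(range(len(probs)), key=probs.__getitem__)
        bank.setdefault(pseudo, []).append(x)
    return bank


def retrieve_defending(bank, feedback_labels, k, rng):
    """Assumed rule: draw k bank samples of the same class per feedback label."""
    defending = []
    for y in feedback_labels:
        pool = bank.get(y, [])
        if pool:
            defending.extend((x, y) for x in rng.choices(pool, k=k))
    return defending


def defending_loss(defending, predict):
    """Mean cross-entropy over retrieved samples (1/(k*B) if no pool is empty)."""
    if not defending:
        return 0.0
    return sum(cross_entropy(y, predict(x)) for x, y in defending) / len(defending)


# Toy stand-in for f_theta: a 2-class model whose logits are the input itself.
def toy_predict(x):
    return softmax(list(x))


unlabeled = [(2.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
bank = build_candidate_bank(unlabeled, toy_predict)
defending = retrieve_defending(bank, feedback_labels=[0, 1], k=2, rng=random.Random(0))
loss_ld = defending_loss(defending, toy_predict)
```

In a full training loop, `loss_ld` would be added to the supervised and unsupervised losses of whichever SemiSDA or SemiSL baseline is in use, which is what makes the strategy a plug-in.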
Abstract
This paper aims to adapt a source model to the target environment by leveraging the small amount of user feedback (i.e., labeled target data) that is readily available in real-world applications. We find that existing semi-supervised domain adaptation (SemiSDA) methods often yield little improvement in adaptation performance when they use such feedback data directly, as shown in Figure 1. We analyze this phenomenon through a novel concept called Negatively Biased Feedback (NBF), which stems from the observation that users are more likely to give feedback on data points where the model produces incorrect predictions. To leverage this feedback while avoiding the issue, we propose a scalable adaptation approach, Retrieval Latent Defending. This approach helps existing SemiSDA methods adapt the model with a balanced supervised signal by using latent defending samples throughout the adaptation process. We demonstrate the problem caused by NBF and the efficacy of our approach across various benchmarks, including image classification, semantic segmentation, and a real-world medical imaging application. Our extensive experiments show that integrating our approach with multiple state-of-the-art SemiSDA methods leads to significant performance improvements.
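To make the notion of Negatively Biased Feedback concrete, the following toy sketch simulates a feedback set in which each labeled sample is drawn from the misclassified pool with some probability. The parameter `p_wrong` and the function name are hypothetical and serve only as an illustration; the abstract does not specify how feedback bias is modeled.

```python
import random


def sample_feedback(predictions, labels, budget, p_wrong, rng):
    """Pick up to `budget` sample indices to receive user feedback.

    Each pick comes from the misclassified pool with probability `p_wrong`
    and from the correctly classified pool otherwise, falling back to the
    other pool when the preferred one is exhausted.
    """
    wrong = [i for i, (p, y) in enumerate(zip(predictions, labels)) if p != y]
    right = [i for i, (p, y) in enumerate(zip(predictions, labels)) if p == y]
    rng.shuffle(wrong)
    rng.shuffle(right)

    chosen = []
    for _ in range(budget):
        prefer_wrong = rng.random() < p_wrong
        first, second = (wrong, right) if prefer_wrong else (right, wrong)
        pool = first if first else second
        if not pool:
            break
        chosen.append(pool.pop())
    return chosen


# Toy model outputs and ground truth: indices 2, 5 and 6 are misclassified.
predictions = [0, 1, 1, 0, 2, 2, 0, 1]
labels = [0, 1, 0, 0, 2, 1, 1, 1]

biased = sample_feedback(predictions, labels, budget=3, p_wrong=1.0, rng=random.Random(0))
unbiased = sample_feedback(predictions, labels, budget=3, p_wrong=0.0, rng=random.Random(0))
```

With `p_wrong=1.0` every feedback sample is one the model got wrong, so the supervised signal only ever pushes predictions away from their current class. That is the imbalance the defending samples are meant to offset.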
