Probabilistic Test-Time Generalization by Variational Neighbor-Labeling
Sameer Ambekar, Zehao Xiao, Jiayi Shen, Xiantong Zhen, Cees G. M. Snoek
TL;DR
This work addresses domain generalization under distribution shift by exploiting unlabeled target data at inference time. It introduces a probabilistic framework that models pseudo labels as latent distributions and refines them into variational neighbor labels using information from neighboring target samples. It also adds a meta-generalization stage during training that simulates adaptation to the target domain. The three contributions are probabilistic pseudo-labeling, variational neighbor labels, and the meta-generalization stage, which together aim at more robust and better-calibrated generalization to unseen targets. On seven domain-generalization benchmarks, the method achieves competitive or state-of-the-art results, with notable gains in calibration and in challenging or small-target-batch scenarios.
Abstract
This paper strives for domain generalization, where models are trained exclusively on source domains before being deployed on unseen target domains. We follow the strict separation of source training and target testing, but exploit the value of the unlabeled target data itself during inference. We make three contributions. First, we propose probabilistic pseudo-labeling of target samples to generalize the source-trained model to the target domain at test time. We formulate generalization at test time as a variational inference problem in which pseudo labels are modeled as distributions, which accounts for uncertainty during generalization and mitigates the misleading signal of inaccurate pseudo labels. Second, we learn variational neighbor labels that incorporate the information of neighboring target samples to generate more robust pseudo labels. Third, to learn the ability to incorporate more representative target information and generate more precise and robust variational neighbor labels, we introduce a meta-generalization stage during training to simulate the generalization procedure. Experiments on seven widely used datasets demonstrate the benefits, abilities, and effectiveness of our proposal.
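The idea of keeping pseudo labels as distributions and smoothing them with neighboring target samples can be illustrated with a small NumPy sketch. This is not the variational model proposed in the paper. It is a deterministic simplification that assumes cosine similarity between features defines the neighborhood, and the function names (`neighbor_soft_labels`, `soft_cross_entropy`) and the temperature `tau` are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def log_softmax(x, axis=-1):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def neighbor_soft_labels(features, logits, tau=0.1, eps=1e-12):
    """Aggregate per-sample pseudo-label distributions over a target batch.

    Assumption (not from the paper): neighbors are weighted by a softmax
    over cosine similarity of features, sharpened by temperature `tau`.
    """
    probs = softmax(logits, axis=1)                      # (N, C) soft pseudo labels
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, eps)           # unit-length features
    sim = normed @ normed.T                              # (N, N) cosine similarity
    weights = softmax(sim / tau, axis=1)                 # each row sums to 1
    return weights @ probs                               # (N, C), rows sum to 1


def soft_cross_entropy(logits, soft_labels):
    """Mean cross-entropy between model predictions and soft targets."""
    return float(-(soft_labels * log_softmax(logits, axis=1)).sum(axis=1).mean())


# Usage on a random unlabeled target batch: 6 samples, 4-d features, 3 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))
logits = rng.normal(size=(6, 3))
labels = neighbor_soft_labels(features, logits)
loss = soft_cross_entropy(logits, labels)
```

Because the targets stay soft, an uncertain prediction contributes a weaker training signal than a hard argmax label would, and averaging over similar samples damps the effect of any single wrong prediction. A small `tau` makes each sample rely mostly on itself, while a larger `tau` spreads the label over more of the batch.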
