Rethinking Precision of Pseudo Label: Test-Time Adaptation via Complementary Learning

Jiayi Han, Longbin Zeng, Liang Du, Weiyang Ding, Jianfeng Feng

TL;DR

The paper tackles test-time adaptation under distribution shift when source data is unavailable. It introduces complementary learning, which uses complementary labels to suppress incorrect pseudo labels and whose risk is consistent with the standard risk defined on true labels. A confidence-aware extension and a memory bank with dynamic thresholding provide robust negative-class supervision during test-time updates. Experiments on CIFAR-10/100 and their corrupted variants, CIFAR-10-C and CIFAR-100-C, show state-of-the-art performance under both one-at-a-time and continual adaptation.

Abstract

In this work, we propose a novel complementary learning approach to enhance test-time adaptation (TTA), which has been shown to perform well on test data with distribution shifts such as corruptions. In test-time adaptation tasks, information from the source domain is typically unavailable, and the model has to be optimized without supervision on test-time samples. Hence, common methods assign labels to unannotated data using the predictions of a well-trained source model within an unsupervised learning framework. Previous studies have employed unsupervised objectives, such as the entropy of model predictions, as optimization targets to learn features for test-time samples. However, model performance is easily compromised by the quality of the pseudo-labels, since inaccurate pseudo-labels introduce noise into the model. Therefore, we propose to leverage the "less probable categories" to decrease the risk of incorrect pseudo-labeling. The complementary label is introduced to designate these categories. We highlight that the risk function of complementary labels agrees with their vanilla loss formula under the conventional true label distribution. Experiments show that the proposed learning algorithm achieves state-of-the-art performance across different datasets and experimental settings.
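
The idea of supervising with "less probable categories" can be illustrated with a minimal sketch. It assumes a commonly used negative-learning loss, -log(1 - p_k), applied to every class flagged as complementary, and a fixed probability threshold for flagging. The function names, the threshold value, and the loss form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def complementary_loss(probs, comp_mask, eps=1e-12):
    """Mean of -log(1 - p_k) over the classes flagged as complementary.

    Assumed loss form (generic negative learning), not the paper's own.
    """
    per_class = -np.log(np.clip(1.0 - probs, eps, 1.0))
    n_flagged = comp_mask.sum()
    if n_flagged == 0:
        return 0.0  # no class is confidently excluded, so no supervision
    return float((per_class * comp_mask).sum() / n_flagged)

# Two unlabeled test samples, three classes.
logits = np.array([[2.0, 1.0, -1.0],
                   [0.2, 0.1, 0.0]])
probs = softmax(logits)

threshold = 0.1  # hypothetical fixed threshold for "less probable"
comp_mask = (probs < threshold).astype(float)
loss = complementary_loss(probs, comp_mask)
```

In this toy batch only the third class of the first sample falls below the threshold, so the ambiguous second sample contributes no gradient. That is the intended effect: the model is pushed away from classes it is confident are wrong, rather than toward a single pseudo label that may be incorrect.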

Paper Structure (23 sections, 12 equations, 4 figures, 10 tables)

Figures (4)

  • Figure 1: Accuracy of positive pseudo labels and complementary labels during test-time adaptation. The complementary (negative) labels are wrong less often than the positive pseudo labels. P.Label and C.Label denote the pseudo label and the complementary label, respectively.
  • Figure 2: An illustration of different types of labels. The right picture is the input sample. "Prediction" represents the predicted probability distribution. Pseudo label, soft label, complementary label, and soft complementary label are generated accordingly.
  • Figure 3: A brief illustration of the proposed dynamic thresholding strategy. For each category, we find its threshold according to $Q$. We then combine the thresholds with the prediction for each sample to generate complementary labels, which are used to train the model. Afterwards, the predictions are used to refresh the memory bank.
  • Figure 4: The effect of different batch sizes on ECL with dynamic thresholding.
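
The loop described in the Figure 3 caption (per-category thresholds, complementary labels, memory bank refresh) can be sketched as follows. This is a rough illustration under stated assumptions: the memory bank is modeled as a FIFO buffer of recent probability vectors, and each category's threshold is taken to be a quantile of the stored probabilities for that category. The caption's $Q$ is not defined in this excerpt, so the quantile rule, the `MemoryBank` class, and the default of 1/num_classes for an empty bank are all hypothetical.

```python
from collections import deque
import numpy as np

class MemoryBank:
    """FIFO store of recent probability vectors (hypothetical helper)."""

    def __init__(self, num_classes, capacity=256):
        self.num_classes = num_classes
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out first

    def thresholds(self, quantile=0.5):
        # Assumed rule: per-class quantile of the stored probabilities.
        if not self.buffer:
            # Empty bank: fall back to the uniform probability.
            return np.full(self.num_classes, 1.0 / self.num_classes)
        stored = np.stack(list(self.buffer))  # shape (n_stored, num_classes)
        return np.quantile(stored, quantile, axis=0)

    def refresh(self, probs):
        # Store each prediction of the current batch.
        for row in probs:
            self.buffer.append(np.asarray(row, dtype=float))

def complementary_labels(probs, thresholds):
    """Flag class k for a sample when its probability is below threshold k."""
    return probs < thresholds  # broadcasts (batch, C) against (C,)

bank = MemoryBank(num_classes=3, capacity=4)

# Step 1: thresholds from the (still empty) bank, then labels, then refresh.
batch1 = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.6, 0.3]])
mask1 = complementary_labels(batch1, bank.thresholds())
bank.refresh(batch1)

# Step 2: thresholds now depend on what the bank has seen.
batch2 = np.array([[0.50, 0.25, 0.25],
                   [0.05, 0.15, 0.80]])
thr2 = bank.thresholds(quantile=0.5)
mask2 = complementary_labels(batch2, thr2)
bank.refresh(batch2)
```

Thresholds are computed before the bank is refreshed, matching the order in the caption, so a batch never sets the threshold that is applied to itself. The boolean masks would then feed a complementary loss during the test-time update.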