STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay

Yongcan Yu, Lijun Sheng, Ran He, Jian Liang

TL;DR

STAble Memory rePlay (STAMP) performs optimization over a stable memory bank instead of the risky mini-batch, and uses a self-weighted entropy minimization strategy that assigns higher weight to low-entropy samples.

Abstract

Test-time adaptation (TTA) aims to address the distribution shift between the training and test data using only unlabeled data at test time. Existing TTA methods often focus on improving recognition performance for test data that belong to classes in the training set. However, during open-world inference, test instances from unknown classes, commonly referred to as outliers, inevitably appear. This paper addresses the problem of performing both sample recognition and outlier rejection during inference when outliers are present. To this end, we propose a new approach called STAble Memory rePlay (STAMP), which performs optimization over a stable memory bank instead of the risky mini-batch. In particular, the memory bank is dynamically updated by selecting low-entropy and label-consistent samples in a class-balanced manner. In addition, we develop a self-weighted entropy minimization strategy that assigns higher weight to low-entropy samples. Extensive results demonstrate that STAMP outperforms existing TTA methods in terms of both recognition and outlier detection performance. The code is released at https://github.com/yuyongcan/STAMP.
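
The self-weighted entropy idea in the abstract can be illustrated with a minimal sketch. The abstract only states that low-entropy samples receive higher weight; the weighting function used here, $w_i \propto \exp(-H_i)$ normalized over the batch, is an assumption made for illustration and may differ from the paper's actual loss. The function names are hypothetical, and the sketch computes only the loss value, not gradients.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def self_weighted_entropy(batch_probs):
    """Weighted mean entropy over a batch of probability vectors.

    ASSUMPTION: weights are exp(-H_i), normalized to sum to 1, so that
    low-entropy (confident) samples count more. The exact weight used
    by STAMP is not given in this text.
    """
    entropies = [entropy(p) for p in batch_probs]
    raw = [math.exp(-h) for h in entropies]
    total = sum(raw)
    weights = [r / total for r in raw]
    loss = sum(w * h for w, h in zip(weights, entropies))
    return loss, weights, entropies


# A confident prediction and an uncertain one.
batch = [[0.97, 0.02, 0.01], [0.4, 0.3, 0.3]]
loss, weights, entropies = self_weighted_entropy(batch)
```

In this example the confident sample receives the larger weight, so the weighted loss is smaller than the plain average of the two entropies. Whether the weight itself should receive gradients is a separate design choice, which the paper examines in its comparison of weighting strategies.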

Paper Structure (17 sections, 9 equations, 4 figures, 6 tables, 2 algorithms)

Figures (4)

  • Figure 1: Illustration of outlier-aware test-time adaptation. The test data stream consists of both normal samples and outliers. When test data arrive, the algorithm adapts the model and provides two outputs: a prediction and an OOD score. The OOD score indicates the likelihood that a sample is an outlier. The detector in the deployment scenario uses the OOD score to decide whether to keep the prediction or reject the sample.
  • Figure 2: The overview of STAMP. STAMP retains reliable samples for optimizing the network by maintaining a memory bank. When test samples arrive, STAMP generates an augmentation-averaged prediction. Two filtering mechanisms are designed to filter out inconsistent and high-entropy samples and add the remaining to the memory. A frequency vector is maintained to select samples to discard when the memory is full. STAMP leverages the reliable samples in the memory to optimize the model with the SAM optimizer alongside a decaying step strategy.
  • Figure 3: Results for different weighting strategies on CIFAR100-C with SVHN-C as the outlier dataset. Static weight means that $w_i$ in the loss equation (\ref{equ:loss}) is used without gradient back-propagation. EATA weight is the weight for reliable samples used in EATA (Niu et al., 2022).
  • Figure 4: Illustration of the effectiveness of the augmentation-averaged robust estimate on CIFAR10-C. $+ Aug$ means augmenting each sample 16 times and averaging the resulting predictions.
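
The memory update described in the Figure 2 caption can be sketched roughly as follows. This is a simplified illustration rather than the released implementation: the class `MemoryBank`, the fixed `entropy_threshold`, the consistency test (the original prediction must agree with the augmentation-averaged one), and the eviction rule (drop the oldest item of the most frequent stored class) are all assumptions chosen to match the caption's wording. Model inference is replaced by precomputed probability vectors.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def average_predictions(views):
    """Element-wise mean of the per-augmentation probability vectors."""
    n = len(views)
    return [sum(col) / n for col in zip(*views)]


def argmax(probs):
    return max(range(len(probs)), key=probs.__getitem__)


class MemoryBank:
    """Hypothetical class-balanced buffer of reliable test samples."""

    def __init__(self, capacity, entropy_threshold):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entropy_threshold = entropy_threshold  # assumed fixed threshold
        self.items = []  # (sample, pseudo_label) pairs, oldest first

    def try_add(self, sample, original_probs, augmented_views):
        avg = average_predictions(augmented_views)
        label = argmax(avg)
        # Filter 1: reject label-inconsistent samples.
        if argmax(original_probs) != label:
            return False
        # Filter 2: reject high-entropy samples.
        if entropy(avg) > self.entropy_threshold:
            return False
        if len(self.items) >= self.capacity:
            self._evict()
        self.items.append((sample, label))
        return True

    def _evict(self):
        # Class frequencies play the role of the caption's frequency
        # vector; the oldest item of the most frequent class is dropped.
        counts = Counter(label for _, label in self.items)
        majority = counts.most_common(1)[0][0]
        for i, (_, label) in enumerate(self.items):
            if label == majority:
                del self.items[i]
                return


bank = MemoryBank(capacity=2, entropy_threshold=0.5)
bank.try_add("a", [0.9, 0.1], [[0.95, 0.05], [0.9, 0.1]])
bank.try_add("b", [0.8, 0.2], [[0.9, 0.1], [0.95, 0.05]])
bank.try_add("c", [0.1, 0.9], [[0.05, 0.95], [0.1, 0.9]])  # evicts "a"
```

Evicting from the most frequent class keeps the stored pseudo-labels closer to balanced, so a burst of samples from one class cannot crowd out the others. The optimization step over the memory (the SAM optimizer with a decaying step size, per the caption) is outside the scope of this sketch.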