MedBN: Robust Test-Time Adaptation against Malicious Test Samples

Hyejin Park, Jeongyeon Hwang, Sunung Mun, Sangdon Park, Jungseul Ok

TL;DR

Test-time adaptation (TTA) methods are vulnerable to data-poisoning attacks that manipulate batch statistics. The authors propose Median Batch Normalization (MedBN), which estimates batch normalization (BN) statistics with the median instead of the mean at test time and plugs into existing TTA frameworks. They show theoretically that the mean can be corrupted by a single malicious sample, whereas the median stays robust unless malicious samples form a majority. Experiments on CIFAR10-C, CIFAR100-C, and ImageNet-C (and on semantic segmentation tasks) show that MedBN substantially improves resilience to both targeted and indiscriminate attacks with minimal loss of benign performance. Overall, MedBN is a practical, algorithm-agnostic defense that strengthens TTA against data poisoning while preserving performance when no attack is present.

Abstract

Test-time adaptation (TTA) has emerged as a promising solution to address performance decay due to unforeseen distribution shifts between training and test data. While recent TTA methods excel in adapting to test data variations, such adaptability exposes a model to vulnerability against malicious examples, an aspect that has received limited attention. Previous studies have uncovered security vulnerabilities within TTA even when a small proportion of the test batch is maliciously manipulated. In response to the emerging threat, we propose median batch normalization (MedBN), leveraging the robustness of the median for statistics estimation within the batch normalization layer during test-time inference. Our method is algorithm-agnostic, thus allowing seamless integration with existing TTA frameworks. Our experimental results on benchmark datasets, including CIFAR10-C, CIFAR100-C and ImageNet-C, consistently demonstrate that MedBN outperforms existing approaches in maintaining robust performance across different attack scenarios, encompassing both instant and cumulative attacks. Through extensive experiments, we show that our approach sustains the performance even in the absence of attacks, achieving a practical balance between robustness and performance.
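
The idea of replacing the mean with the median inside a BN layer can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `med_bn_forward`, the flat `(n, d)` feature layout, and the choice to estimate the variance as the median of squared deviations from the median are all assumptions made for this example.

```python
import numpy as np

def med_bn_forward(z, gamma, beta, eps=1e-5):
    """Median-based batch normalization for features z of shape (n, d).

    Assumption: both statistics are taken per feature over the batch axis,
    and the variance is estimated as the median of squared deviations from
    the median. The paper's exact estimator may differ.
    """
    mu_hat = np.median(z, axis=0)                    # robust location estimate
    var_hat = np.median((z - mu_hat) ** 2, axis=0)   # robust scale estimate
    z_norm = (z - mu_hat) / np.sqrt(var_hat + eps)
    return gamma * z_norm + beta                     # learnable affine transform

# Four benign activations and one extreme (hypothetical malicious) activation.
z = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
out = med_bn_forward(z, gamma=np.ones(1), beta=np.zeros(1))
print(out[:4, 0])  # benign rows come out near [-2, -1, 0, 1]
```

Here the median is 3 and the median squared deviation is 1, so the benign rows are normalized as if the outlier were absent. With mean-based statistics, the same outlier would pull the mean to 22 and inflate the variance, compressing all benign rows toward a single value.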

Paper Structure

This paper contains 30 sections, 2 theorems, 16 equations, 12 figures, 19 tables, 1 algorithm.

Key Result

Theorem 1

Consider a set of $n$ numbers $\mathcal{B} = \{x_i \in \mathbb{R}: i \in [n]\}$ and $1 \le m \le n$, where the first $m$ numbers may be manipulated by adversaries. Let $\mathcal{B}_{\textnormal{mal}} = \{x_i : i \in [m]\}$ and $\mathcal{B}_{\textnormal{ben}} = \mathcal{B} \setminus \mathcal{B}_{\textnormal{mal}}$. … (ii) The median is robust against malicious samples unless they are the majority, i.e., for any …
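
The contrast between the mean and the median in this setting can be checked numerically. The sketch below is only an illustration under assumed values: the helper `poisoned_statistics` and the benign numbers are hypothetical and not taken from the paper.

```python
import numpy as np

def poisoned_statistics(benign, malicious):
    """Return (mean, median) of the batch formed by malicious + benign values."""
    batch = np.concatenate([malicious, benign])
    return batch.mean(), np.median(batch)

# Hypothetical benign values (n - m = 7).
benign = np.array([0.1, -0.3, 0.5, 0.2, -0.1, 0.4, 0.0])

# m = 1: a single malicious value drags the mean arbitrarily far,
# while the median stays inside the range of the benign values.
for attack in (1e2, 1e4, 1e6):
    mean, med = poisoned_statistics(benign, np.array([attack]))
    print(f"attack={attack:.0e}  mean={mean:.3f}  median={med:.3f}")

# m = 8 > n / 2: once malicious values are the majority, the median follows them.
mean_maj, med_maj = poisoned_statistics(benign, np.full(8, 1e6))
print(f"majority attack: median={med_maj:.0f}")
```

With one malicious value among eight, the two middle order statistics are always benign, so the median cannot leave the benign range however large the attack value is. The last call shows the limit of that guarantee: with eight malicious values out of fifteen, the median equals the attack value.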

Figures (12)

  • Figure 1: An illustrative example of the vulnerability of the mean in a batch normalization layer to manipulation by a malicious sample (left), contrasted with the robustness of the median to such manipulation (right), when malicious samples are present in the batch.
  • Figure 2: An overview of MedBN. (Top) TTA methods adapted with BN layers normalize the features ($z$) by estimating normalization statistics $\hat{\mu}$ and $\hat{\sigma}^{2}$, and optimize transformation parameters $\gamma$ and $\beta$. (Bottom) In contrast to conventional BN, which computes the statistics based on the mean of inputs, our proposed MedBN utilizes the median value for estimating the statistics, $\hat{\mu}$ and $\hat{\sigma}^{2}$.
  • Figure 3: Analysis of the vulnerability of existing TTA methods to attacks. The targeted-attack and indiscriminate-attack panels show the relation between the entropy and gradient norm of benign and malicious samples, respectively. The filtering panel illustrates the proportion of malicious samples $\mathcal{B}_{\textnormal{mal}}$ among the samples remaining after filtering, across corruption types, starting from a batch in which 20% of the samples are malicious. All experiments are performed on the CIFAR10-C dataset with Gaussian noise, using a ResNet26, at the highest severity of distribution shift, i.e., level 5.
  • Figure 4: t-SNE visualizations in representative layers of BN and MedBN, across ResNet26 blocks, with benign samples (blue dots) and malicious samples (red crosses).
  • Figure 5: L1 distance for measuring the amount of perturbation by malicious samples.
  • ...and 7 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2: Extension of Theorem 1