Class-Conditioned Transformation for Enhanced Robust Image Classification

Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad, Alex M. Bronstein

TL;DR

This work proposes a novel test-time, threat-model-agnostic algorithm that enhances Adversarial-Trained (AT) models and lets users choose the desired balance between clean and robust accuracy without any additional training.

Abstract

Robust classification methods predominantly concentrate on algorithms that address a specific threat model, which leaves them ineffective against other threat models. Real-world applications are exposed to this vulnerability, as malicious attackers might exploit alternative threat models. In this work, we propose a novel test-time, threat-model-agnostic algorithm that enhances Adversarial-Trained (AT) models. Our method operates through COnditional image transformation and DIstance-based Prediction (CODIP) and includes two main steps: First, we transform the input image, which might be either clean or attacked, into each of the dataset's classes. Next, we predict the class whose transformed image lies at the shortest distance from the input. The conditional transformation utilizes the perceptually aligned gradients property possessed by AT models and, as a result, eliminates the need for additional models or additional training. Moreover, it allows users to choose the desired balance between clean and robust accuracy without training. The proposed method achieves state-of-the-art results, demonstrated through extensive experiments on various models, AT methods, datasets, and attack types. Notably, applying CODIP leads to substantial robust accuracy improvements of up to $+23\%$, $+20\%$, $+26\%$, and $+22\%$ on the CIFAR10, CIFAR100, ImageNet, and Flowers datasets, respectively.
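The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear softmax classifier in place of an AT network, and it assumes the class-conditioned transformation is a few gradient-descent steps on the cross-entropy loss toward the target class. The names `transform_to_class` and `codip_predict`, and the parameters `alpha` and `steps`, are hypothetical and chosen only for this illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


def transform_to_class(x, target, W, b, alpha=0.1, steps=10):
    """Assumed transformation: gradient descent on the cross-entropy
    loss of a linear softmax classifier, pulling x toward `target`."""
    x_t = x.copy()
    onehot = np.eye(W.shape[0])[target]
    for _ in range(steps):
        probs = softmax(W @ x_t + b)
        grad = W.T @ (probs - onehot)  # d(cross-entropy)/dx for this toy model
        x_t = x_t - alpha * grad
    return x_t


def codip_predict(x, W, b, alpha=0.1, steps=10):
    """Transform x toward every class, then predict the class whose
    transformed image is closest to x in l2 distance."""
    n_classes = W.shape[0]
    dists = np.array([
        np.linalg.norm(x - transform_to_class(x, c, W, b, alpha, steps))
        for c in range(n_classes)
    ])
    return int(np.argmin(dists)), dists


# Toy usage: three classes, three input dimensions (illustrative values only).
W = np.eye(3)
b = np.zeros(3)
x = np.array([2.0, 0.0, 0.0])
pred, dists = codip_predict(x, W, b)
```

In this toy setting an input that already sits well inside a class region needs only a small change to reach that class and a larger one to reach any other, which is the intuition behind predicting by shortest distance.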

Paper Structure (21 sections, 4 figures, 4 tables)

Figures (4)

  • Figure 1: An Overview of CODIP. First, the input image (clean or attacked) is passed through the class-conditioned transformations $\{\text{T}(\cdot|1), \dots,\text{T}(\cdot|N)\}$, one per dataset class, creating the images $\{\text{Image}_1, \dots, \text{Image}_N\}$. Next, the $\ell_2$ distance is calculated between the input image and each transformed image in $\{\text{Image}_1, \dots, \text{Image}_N\}$, and the prediction is made based on the shortest distance.
  • Figure 2: A Comparison Between the Classifier and CODIP Decision Rules. The background color of the image describes the classifier's decision rule, and the intensity describes the classifier's certainty. The clean image (green dot) is attacked (red dot), leading to a wrong classification. In contrast, CODIP predicts based on the shortest transformation. It operates in two steps: First, it applies a class-conditioned transformation (dotted arrows) that moves the attacked image toward each of the dataset's classes (blue dots). Next, the prediction is made based on the shortest distance between the attacked image and the transformed images, highlighted by a red dotted circle.
  • Figure 3: Impact of $\gamma$ on Clean-Robust Accuracy Trade-off. We present three $\alpha$ working points on the ImageNet dataset using an AT model trained with $L_2$, $\epsilon = 3.0$.
  • Figure 4: Clean-Robust Accuracy Trade-off. A demonstration of our proposed controlled clean-robust accuracy trade-off. The trade-off is controlled by adapting the step size $\alpha$, whose value is specified beside each CODIP working point. The test-time methods compared are 'Base', which is the base AT model; TTE [perez2021enhancing]; DRQ [schwinn2022improving]; and CODIP.
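
The decision rule described in the Figure 1 caption can be written compactly. This is a restatement in the caption's notation, not a formula quoted from the paper; the symbols $x$ (the input image) and $\hat{y}$ (the predicted class) are labels introduced here for illustration.

$$
\hat{y} \;=\; \arg\min_{i \in \{1, \dots, N\}} \bigl\| x - \text{T}(x \mid i) \bigr\|_2
$$

Here $\text{T}(x \mid i)$ corresponds to $\text{Image}_i$ in the caption, so the predicted class is the one whose transformed image requires the smallest $\ell_2$ change from the input.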