A Random Ensemble of Encrypted Vision Transformers for Adversarially Robust Defense

Ryota Iijima; Sayaka Shiota; Hitoshi Kiya

TL;DR

The paper addresses the adversarial robustness of vision models by combining secret-key encryption with a random ensemble of vision transformers (ViTs). N sub-models are each trained on data encrypted with a distinct key, and at inference a random ensemble aggregates the outputs of S randomly chosen sub-models. This design defends against both white-box and black-box attacks while preserving high clean accuracy. Evaluations on CIFAR-10 and ImageNet using AutoAttack and RobustBench indicate higher clean and robust accuracy than state-of-the-art defenses. The method leverages ViT properties and block-wise encryption to curb the transferability of adversarial perturbations.

Abstract

Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In previous studies, models encrypted with a secret key were demonstrated to be robust against white-box attacks, but not against black-box ones. In this paper, we propose a novel method, a random ensemble of encrypted vision transformer (ViT) models, for enhancing robustness against both white-box and black-box attacks. In addition, a benchmark attack method, AutoAttack, is applied to the models to test adversarial robustness objectively. In experiments on image classification with the CIFAR-10 and ImageNet datasets, the method was demonstrated to be robust against not only white-box attacks but also black-box ones. The method was also compared with state-of-the-art defenses on RobustBench, a standardized benchmark for adversarial robustness, and was verified to outperform conventional defenses in terms of both clean accuracy and robust accuracy.
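
The random-ensemble idea can be illustrated with a minimal sketch. It is not the authors' implementation, and the following are assumptions made only for illustration: the key is a seeded permutation applied inside fixed-size blocks of a flat pixel list, the aggregation rule is a plain average of class scores, and `toy_sub_model` is a hypothetical stand-in for a trained encrypted ViT. All function names are likewise hypothetical.

```python
import random


def make_key(seed, block_len):
    """Assumed key format: a seeded permutation of positions within one block."""
    perm = list(range(block_len))
    random.Random(seed).shuffle(perm)
    return perm


def encrypt_blocks(pixels, key):
    """Shuffle a flat list of pixel values block by block using the same key."""
    block_len = len(key)
    if len(pixels) % block_len != 0:
        raise ValueError("input length must be a multiple of the block length")
    out = []
    for start in range(0, len(pixels), block_len):
        block = pixels[start:start + block_len]
        out.extend(block[i] for i in key)
    return out


def toy_sub_model(bias):
    """Hypothetical stand-in for a trained encrypted ViT: returns two class scores."""
    def predict(encrypted_pixels):
        mean = sum(encrypted_pixels) / len(encrypted_pixels)
        return [mean + bias, 1.0 - mean - bias]
    return predict


def random_ensemble_predict(pixels, sub_models, keys, s, rng):
    """Pick S of the N (sub-model, key) pairs at random and average their scores.

    Averaging is an assumed aggregation rule chosen for this sketch.
    Returns the predicted class index and the indices of the chosen sub-models.
    """
    chosen = rng.sample(range(len(sub_models)), s)
    # Each chosen sub-model sees the input encrypted with its own key.
    scores = [sub_models[i](encrypt_blocks(pixels, keys[i])) for i in chosen]
    n_classes = len(scores[0])
    avg = [sum(sc[c] for sc in scores) / s for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, chosen


# Usage: N = 5 sub-models with distinct keys, S = 3 drawn per query.
keys = [make_key(seed, block_len=4) for seed in range(5)]
sub_models = [toy_sub_model(0.01 * i) for i in range(5)]
label, chosen = random_ensemble_predict(
    [0.9] * 16, sub_models, keys, s=3, rng=random.Random()
)
```

Because a fresh subset is drawn for every query, an attacker cannot know in advance which keys and sub-models will produce the output, which is the intuition behind the reduced transferability claimed for the ensemble.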
