Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models

Weikang Yu; Yonghao Xu; Pedram Ghamisi

TL;DR

This work addresses the vulnerability of remote sensing DNNs to universal adversarial perturbations by introducing UAD-RS, a defense that leverages a single pre-trained denoising diffusion probabilistic model to purify adversarial samples across heterogeneous attacks. The approach diffuses adversarial inputs with Gaussian noise and then denoises them, with an Adaptive Noise Level Selection (ANLS) mechanism guiding the optimal diffusion level via a task-guided Fréchet Inception Distance (FID) ranking. Comprehensive experiments on four RS datasets for scene classification and semantic segmentation demonstrate that UAD-RS outperforms state-of-the-art purification methods under seven attacks, while significantly reducing defensive training requirements. The framework shows strong cross-dataset robustness and highlights diffusion models as a practical, universal defense tool for AI4EO tasks, albeit with higher inference costs and potential limitations on extreme high-contrast data.

Abstract

Deep neural networks (DNNs) have become key solutions in numerous AI applications for earth observation (AI4EO). However, their susceptibility to adversarial examples poses a critical challenge, compromising the reliability of AI4EO algorithms. This paper presents a novel Universal Adversarial Defense approach in Remote Sensing Imagery (UAD-RS), which leverages pre-trained diffusion models to protect DNNs against universal adversarial examples exhibiting heterogeneous patterns. Specifically, a universal adversarial purification framework is developed that uses pre-trained diffusion models to mitigate adversarial perturbations by first adding Gaussian noise to the adversarial examples and then denoising them. Additionally, an Adaptive Noise Level Selection (ANLS) mechanism is introduced to determine the optimal noise level for the purification framework with a task-guided Fréchet Inception Distance (FID) ranking strategy, thereby enhancing purification performance. Consequently, only a single pre-trained diffusion model per dataset is required to purify universal adversarial samples with heterogeneous patterns, significantly reducing the training effort for multiple attack settings while maintaining high performance without prior knowledge of the adversarial perturbations. Experimental results on four heterogeneous RS datasets, focusing on scene classification and semantic segmentation, demonstrate that UAD-RS outperforms state-of-the-art adversarial purification approaches, providing universal defense against seven commonly encountered adversarial perturbations. The code and pre-trained models are available online (https://github.com/EricYu97/UAD-RS).
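The diffuse-then-denoise purification and the noise-level ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear noise schedule, the function names (`make_alpha_bar`, `diffuse`, `select_noise_level`) and the `purify` and `score` callables are assumptions made for illustration. In particular, `purify` stands in for the forward diffusion plus the reverse pass of a pre-trained diffusion model, and `score` stands in for the task-guided FID, neither of which is reproduced here.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from x_0 in closed form, as in the standard DDPM forward process."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


def select_noise_level(x_adv, candidate_levels, purify, score):
    """Purify at every candidate noise level and keep the lowest-scoring result.

    `purify(x, t)` and `score(x)` are hypothetical callables supplied by the
    caller; lower scores are treated as better, as with FID.
    """
    results = {t: purify(x_adv, t) for t in candidate_levels}
    scores = {t: score(x) for t, x in results.items()}
    best = min(scores, key=scores.get)
    return best, results[best], scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = make_alpha_bar()
    x_adv = rng.uniform(-1.0, 1.0, size=(3, 32, 32))  # toy "adversarial" image

    # Toy stand-ins: "purification" is only the forward step, and the score
    # is a plain pixel-variance proxy rather than a real FID.
    toy_purify = lambda x, t: diffuse(x, t, alpha_bar, rng)
    toy_score = lambda x: float(np.var(x))

    best_t, purified, scores = select_noise_level(
        x_adv, [10, 50, 100], toy_purify, toy_score
    )
    print(best_t, purified.shape)
```

In a real pipeline, `purify` would run the reverse process of the pre-trained model starting from the diffused input, and `score` would compare features of the purified images against clean reference data.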

Paper Structure (50 sections, 20 equations, 9 figures, 7 tables, 1 algorithm)


Introduction
Related Work
Adversarial Attacks
Fast Gradient Sign Method (FGSM)
Trade-off Projected Gradient Descent (TPGD) Attack
Carlini and Wagner (CW) Attack
Mixcut-Attack
Adversarial Purification
Diffusion Models
Methodology
Pre-training Generative Diffusion Model
Unified Adversarial Purification Framework
Adaptive Noise Level Selection
Experimental Results
Dataset Description
...and 35 more sections

Figures (9)

Figure 1: Illustration of universal adversarial defense on scene classification in remote sensing (RS) images. The adversarial examples generated with heterogeneous attack methods significantly affect the performance of different DNNs. The proposed UAD-RS aims to protect the DNNs from the universal adversarial patterns by purifying the adversarial examples in a unified model.
Figure 2: Illustration of the forward and reverse processes of the generative diffusion models in the pre-training phase. The forward diffusion process gradually adds Gaussian noise to the images according to the noise scheduler until only pure Gaussian noise remains. The reverse process then progressively removes the noise with a denoising U-Net model to reconstruct an image.
Figure 3: Overview of the proposed UAD-RS adversarial purification framework. The purification process removes adversarial perturbations in two steps: the forward process of the pre-trained diffusion model disrupts them with Gaussian noise, and the reverse process denoises the resulting latent to produce clean images. Subsequently, the Adaptive Noise Level Selection (ANLS) algorithm identifies the optimal noise level $T_{m}^{*}$, the one that yields the best purification results among the candidate noise levels, by calculating and ranking their task-guided Fréchet Inception Distance (FID) scores. As a result, predictions on the purified images are no longer influenced by the original perturbations.
Figure 4: Qualitative comparison for adversarial purification on the UCM dataset. (a) Adversarial samples. (b) Ground truth. (c)-(f) Purified results obtained by (c) Pix2Pix. (d) PSGAN. (e) TGDN. (f) UAD-RS.
Figure 5: Qualitative comparison for adversarial purification on the AID dataset. (a) Adversarial samples. (b) Ground truth. (c)-(f) Purified results obtained by (c) Pix2Pix. (d) PSGAN. (e) TGDN. (f) UAD-RS.
...and 4 more figures
