Bridging Optimal Transport and Jacobian Regularization by Optimal Trajectory for Enhanced Adversarial Defense

Binh M. Le, Shahroz Tariq, Simon S. Woo

TL;DR

This work tackles adversarial vulnerability in vision models by comparing Adversarial Training (AT) and Jacobian Regularization (JR) and proposing a unified defense, OTJR, that uses the Sliced Wasserstein distance to compute optimal latent trajectories. By replacing random projections with informative trajectories in Jacobian regularization and jointly minimizing the transport distance between clean and adversarial representations, OTJR achieves strong robustness on CIFAR-10/100 with competitive clean accuracy and favorable convergence behavior. The approach is validated across white-box and black-box attacks, online adversarial scenarios, and large-scale datasets, and is shown to be compatible with existing defense frameworks. The findings demonstrate that integrating optimal transport insights into adversarial defenses yields practical improvements and real-world resilience, highlighting the method's significance for secure deployment of deep learning systems.

Abstract

Deep neural networks, particularly in vision tasks, are notably susceptible to adversarial perturbations. To overcome this challenge, developing a robust classifier is crucial. In light of recent advances in classifier robustness, we examine adversarial training and Jacobian regularization, two pivotal defenses. Our work is the first to carefully analyze and characterize these two schools of approaches, both theoretically and empirically, to demonstrate how each approach impacts the robust learning of a classifier. Next, we propose our novel Optimal Transport with Jacobian regularization method, dubbed OTJR, which bridges input Jacobian regularization with output representation alignment by leveraging optimal transport theory. In particular, we employ the Sliced Wasserstein (SW) distance, which can efficiently push the adversarial samples' representations closer to those of clean samples, regardless of the number of classes within the dataset. The SW distance provides the adversarial samples' movement directions, which are much more informative and powerful for Jacobian regularization than random projections. Our empirical evaluations set a new standard in the domain, with our method achieving accuracies of 52.57% on CIFAR-10 and 28.3% on CIFAR-100 under AutoAttack. To further validate our model's practicality, we conducted real-world tests by subjecting internet-sourced images to online adversarial attacks. These demonstrations highlight our model's capability to counteract sophisticated adversarial perturbations, affirming its significance and applicability in real-world scenarios.
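To make the Sliced Wasserstein distance mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the number of projections, the fixed seed, and the equal-batch-size restriction are all assumptions made for illustration. It uses the standard Monte Carlo construction: project both point sets onto random unit directions, sort the one-dimensional projections, and average the squared gaps.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(points, direction):
    """Project each point onto the 1-D line spanned by `direction`."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]


def sliced_wasserstein_sq(source, target, n_projections=64, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance
    between two equal-sized point sets (hypothetical helper)."""
    assert len(source) == len(target), "sketch assumes equal batch sizes"
    rng = random.Random(seed)
    dim = len(source[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_unit_vector(dim, rng)
        # In 1-D, optimal transport pairs sorted samples with sorted samples.
        s = sorted(project(source, theta))
        t = sorted(project(target, theta))
        total += sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
    return total / n_projections


# Toy "clean" representations and a uniformly shifted "adversarial" copy.
clean = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shifted = [[x + 0.5, y + 0.5] for x, y in clean]
```

Calling `sliced_wasserstein_sq(clean, shifted)` returns a positive value, while comparing a set with itself returns zero. Each slice costs only a projection and a sort over the batch of representations, so the cost depends on the batch size and feature dimension rather than on the number of classes.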

Paper Structure (29 sections, 16 equations, 8 figures, 11 tables)

Figures (8)

  • Figure 1: Illustration of (top) two popular approaches to boost a model's robustness, Adversarial Training (AT) vs. Jacobian regularization (JR), and (bottom) our OTJR method. Jacobian regularization tries to silence the Jacobian matrix at the input end, whereas AT adjusts the distribution of perturbed samples at the output end. In the conventional approach, Jacobian regularization backpropagates through random projections, whereas AT backpropagates through a loss function. Our proposed OTJR bridges AT and Jacobian regularization in a single framework via optimal transport theory (Sliced Wasserstein distance).
  • Figure 2: The magnitude of activation at the penultimate layer for models trained with $\mathcal{XE}$ loss, PGD-AT adversarial training, and the input-output Jacobian regularization. The channels in the X-axis are sorted in descending order of the clean samples' magnitude.
  • Figure 3: Magnitude of $||\nabla_{\tilde{x}} \mathcal{L}_{\mathcal{XE}}||_1$ at the input layer for a model trained with PGD-AT and Jacobian regularization. Red and green-filled areas range from min. to max. values of each sample.
  • Figure 4: Ratios of $\mathbb{E}(||\nabla_{\theta_i}\mathcal{L}(\tilde{x})|| / ||\nabla_{\theta_i}\mathcal{L}({x})||)$ w.r.t. the model's parameters $\theta_{i}$ on CIFAR-10. The lower the ratios are, the more emphasis the model puts on perturbations.
  • Figure 5: Illustration of optimal latent trajectories. Top row ((a) & (b)): Random movement directions (green arrows), which are non-informative, sampled uniformly from the unit sphere $\mathcal{S}^{1}$ in two dimensions. Bottom row ((c) & (d)): The optimal trajectories given by the SW distance between the source distribution (green) and the target distribution (orange), obtained from Eq. \ref{eqn:opt_move}.
  • ...and 3 more figures
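
The contrast drawn in the Figure 5 caption, random directions versus trajectories induced by the SW distance, can be sketched as follows. The equation the caption refers to is not reproduced in this excerpt, so this is only one plausible reading and not the paper's formula: for every random slice, the k-th smallest source projection is matched to the k-th smallest target projection, and each source point accumulates the resulting displacement along that slice. The helper names, the projection count, and the seed are hypothetical.

```python
import math
import random


def unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def sw_movement_directions(source, target, n_projections=128, seed=0):
    """For each source point, average the displacement suggested by the
    sorted 1-D matching along each random slice (illustrative assumption)."""
    assert len(source) == len(target), "sketch assumes equal batch sizes"
    rng = random.Random(seed)
    n, dim = len(source), len(source[0])
    moves = [[0.0] * dim for _ in range(n)]
    for _ in range(n_projections):
        theta = unit_vector(dim, rng)
        ps = [sum(a * b for a, b in zip(p, theta)) for p in source]
        pt = sorted(sum(a * b for a, b in zip(p, theta)) for p in target)
        # Rank the source points along this slice; rank k is matched to
        # the k-th smallest target projection.
        order = sorted(range(n), key=lambda i: ps[i])
        for rank, i in enumerate(order):
            step = pt[rank] - ps[i]
            for d in range(dim):
                moves[i][d] += step * theta[d] / n_projections
    return moves


# Source points (e.g. adversarial representations) and a target set that is
# the same cloud translated by 2 along the first axis.
src = [[0.0, 0.0], [1.0, 0.3], [0.2, 1.0], [1.1, 1.4]]
tgt = [[x + 2.0, y] for x, y in src]
directions = sw_movement_directions(src, tgt)
```

In this toy setup every entry of `directions` points predominantly along the first axis, toward the target cloud, whereas a direction drawn uniformly from $\mathcal{S}^{1}$ carries no such information about where the target lies.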