SRasP: Self-Reorientation Adversarial Style Perturbation for Cross-Domain Few-Shot Learning

Wenqian Li, Pengfei Fang, Hui Xue

TL;DR

SRasP (Self-Reorientation Adversarial Style Perturbation) is a crop-global style perturbation network that uses global semantic guidance to identify incoherent crops, then reorients the style gradients of those crops and aggregates them with the global style gradient of the same image.

Abstract

Cross-Domain Few-Shot Learning (CD-FSL) aims to transfer knowledge from a seen source domain to unseen target domains, serving as a key benchmark for evaluating the robustness and transferability of models. Existing style-based perturbation methods mitigate domain shift but often suffer from gradient instability and convergence to sharp minima. To address these limitations, we propose a novel crop-global style perturbation network, termed Self-Reorientation Adversarial Style Perturbation (SRasP). Specifically, SRasP leverages global semantic guidance to identify incoherent crops, then reorients the style gradients of these crops and aggregates them with the global style gradient of the same image. Furthermore, we propose a novel multi-objective optimization function that maximizes visual discrepancy while enforcing semantic consistency among global, crop, and adversarial features. Applying the stabilized perturbations during training encourages convergence toward flatter and more transferable solutions, improving generalization to unseen domains. Extensive experiments on multiple CD-FSL benchmarks demonstrate consistent improvements over state-of-the-art methods.
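
The abstract does not give the formulas for crop selection, reorientation, or aggregation, so the following is only an illustrative sketch of one possible reading. It assumes that a crop counts as incoherent when its feature has low cosine similarity to the global feature, that reorientation removes the part of a crop gradient that opposes the global gradient (a PCGrad-style projection used here as a stand-in), and that aggregation is a plain average. The function names, the threshold, and the averaging rule are all hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / denom if denom > 0 else 0.0


def select_incoherent(global_feat, crop_feats, threshold=0.5):
    """Indices of crops whose feature disagrees with the global feature.

    Assumed criterion: cosine similarity below `threshold` (hypothetical).
    """
    return [i for i, f in enumerate(crop_feats)
            if cosine(global_feat, f) < threshold]


def reorient(crop_grad, global_grad):
    """Remove the component of crop_grad that opposes global_grad.

    PCGrad-style projection, used as a stand-in for the paper's rule.
    Gradients that already agree with global_grad are returned unchanged.
    """
    d = dot(crop_grad, global_grad)
    if d >= 0:
        return list(crop_grad)
    scale = d / dot(global_grad, global_grad)
    return [c - scale * g for c, g in zip(crop_grad, global_grad)]


def aggregate(global_grad, crop_grads):
    """Average the global gradient with the reoriented crop gradients."""
    vecs = [list(global_grad)] + [reorient(c, global_grad) for c in crop_grads]
    return [sum(col) / len(vecs) for col in zip(*vecs)]
```

Under these assumptions, a crop gradient that points against the global gradient is projected onto the plane orthogonal to it before averaging, so the aggregated direction never has a negative inner product with the global gradient.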

Paper Structure

This paper contains 33 sections, 28 equations, 7 figures, and 7 tables.

Figures (7)

  • Figure 1: (a) Given an input image, multiple local crops are extracted. The training process applies one of three crop selection strategies, including random-crop selection, concept-crop selection and incoherent-crop selection, each leading to different gradient directions during optimization. (b) The gradient cosine similarity across training epochs shows that the proposed method maintains consistently higher stability compared with other perturbation methods, indicating a more reliable update trajectory. (c) Loss-surface visualizations further demonstrate that our incoherent-crop perturbation drives the model toward flatter and more generalizable minima.
  • Figure 2: Comparison between Vanilla Global Perturbation and our Self-Reorientation Perturbation. Left: In the vanilla approach, a global style perturbation is directly applied to the feature map of the entire input. Right: In our Self-Reorientation Perturbation, the input image is first divided into incoherent crops. These crop style gradients are then reoriented and aggregated with the global style gradient.
  • Figure 3: Overview of the proposed SRasP. SRasP first samples multiple localized crops from an input image and identifies incoherent regions that exhibit semantic inconsistency with the global content. Style gradients are extracted from both crop-level and global features, where incoherent crop gradients are reoriented toward the global semantic direction and aggregated through a self-reorientation gradient ensemble to resolve gradient conflicts. The resulting stable and semantically guided global style gradient is then used to synthesize hard yet meaningful adversarial style perturbations via AdaIN. In addition, a consistency–discrepancy triplet objective is employed to maximize visual style diversity while preserving semantic alignment among global, crop, and adversarial representations, enabling SRasP to generate stronger style variations and improve cross-domain generalization.
  • Figure 4: Meta-training loss curves of the baseline and the proposed SRasP.
  • Figure 5: Performance for different numbers of crops $k$.
  • ...and 2 more figures
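
The Figure 3 caption says the aggregated style gradient is used to synthesize adversarial style perturbations via AdaIN. As a rough illustration, the sketch below applies the standard AdaIN re-normalization to each channel after shifting its mean and standard deviation one step along the sign of a style gradient. The signed-step update, the step size, and the function names are assumptions made for this example; the source does not specify how the perturbed statistics are computed.

```python
import math


def channel_stats(channel, eps=1e-5):
    """Mean and standard deviation of one flattened feature channel."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return mean, math.sqrt(var + eps)


def adain(channel, new_mean, new_std, eps=1e-5):
    """AdaIN: normalize the channel, then rescale to the target statistics."""
    mean, std = channel_stats(channel, eps)
    return [new_std * (x - mean) / std + new_mean for x in channel]


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def perturb_style(feature_map, grad_mean, grad_std, step=0.1):
    """Shift each channel's style statistics along the gradient sign.

    feature_map: list of channels, each a flat list of floats.
    grad_mean, grad_std: one gradient value per channel (assumed inputs).
    The signed step is a hypothetical update rule, not the paper's.
    """
    out = []
    for channel, g_mu, g_sigma in zip(feature_map, grad_mean, grad_std):
        mean, std = channel_stats(channel)
        adv_mean = mean + step * sign(g_mu)
        # Clamp so the perturbed standard deviation never goes negative.
        adv_std = max(std + step * sign(g_sigma), 0.0)
        out.append(adain(channel, adv_mean, adv_std))
    return out
```

Because only the per-channel mean and standard deviation change, the normalized spatial pattern of each channel is preserved while its style statistics move.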