Data-free Knowledge Distillation for Fine-grained Visual Categorization

Renrong Shao, Wei Zhang, Jianhua Yin, Jun Wang

TL;DR

The paper tackles the challenge of fine-grained visual categorization under data-free knowledge distillation by introducing DFKD-FGVC, an adversarial framework that combines a spatially attentive generator, mixed high-order attention distillation, and semantic feature contrast learning. By synthesizing discriminative, fine-grained images and aligning high-level semantic representations between teacher and student in hyperspace, the method achieves state-of-the-art results on FGVC benchmarks without real data. Key contributions include the spatial attention generator, MHAD to model part interactions, and SFCL to maximize semantic separability, all validated through extensive experiments, ablations, and visual analyses. The approach enables privacy-preserving model compression and deployment in data-restricted settings while maintaining robust fine-grained performance.

Abstract

Data-free knowledge distillation (DFKD) is a promising approach for addressing issues related to model compression, security and privacy, and transmission restrictions. Although existing DFKD methods have achieved encouraging results in coarse-grained classification, they obtain sub-optimal results in practical applications involving fine-grained classification, which requires more detailed distinctions between similar categories. To address this issue, we propose an approach called DFKD-FGVC that extends DFKD to fine-grained visual categorization (FGVC) tasks. Our approach utilizes an adversarial distillation framework with an attention generator, mixed high-order attention distillation, and semantic feature contrast learning. Specifically, we introduce a spatial-wise attention mechanism to the generator to synthesize fine-grained images with more details of discriminative parts. We also utilize the mixed high-order attention mechanism to capture complex interactions among parts and the subtle differences among discriminative features of the fine-grained categories, paying attention to both local features and semantic context relationships. Moreover, we leverage the teacher and student models of the distillation framework to contrast high-level semantic feature maps in the hyperspace, comparing variances of different categories. We evaluate our approach on three widely used FGVC benchmarks (Aircraft, Cars196, and CUB200) and demonstrate its superior performance.
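
The idea of contrasting teacher and student semantic features can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch and not the paper's SFCL objective: the function names, the cosine similarity, the `temperature` value, and the omission of the MLP projection are all assumptions made for illustration. Each student embedding is pulled toward the teacher embedding of the same sample and pushed away from the teacher embeddings of the other samples.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(student, teacher, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the paper's SFCL).

    student, teacher: lists of embedding vectors; index i in both lists
    refers to the same synthetic sample, so (student[i], teacher[i]) is
    the positive pair and every other teacher vector is a negative.
    """
    total = 0.0
    for i, s in enumerate(student):
        logits = [cosine(s, t) / temperature for t in teacher]
        # Numerically stable log-sum-exp over all teacher candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching teacher as the target.
        total += log_denom - logits[i]
    return total / len(student)


aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss is small when each student embedding matches its own teacher embedding and large when it matches a different sample's embedding, which is the separability behaviour the abstract describes in general terms.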

Paper Structure (18 sections, 11 equations, 7 figures, 6 tables, 1 algorithm)

Figures (7)

  • Figure 1: Overall framework of our approach. Left: a spatial attention module is plugged into each block of the generator $\mathcal{G}$ so that it focuses on fine-grained semantic information throughout the mapping from noise $z$ to images $\hat{x}$. Middle: at each block of the teacher and student, feature maps are extracted by the mixed high-order attention module to perform MHAD. Right: at the penultimate layer, an MLP maps the high-level semantic features of the teacher and student to a common hyperspace, where SFCL compares their variances.
  • Figure 2: The spatial attention module of the generator, in which $\otimes$ denotes the element-wise multiplication and $\oplus$ denotes the element-wise addition.
  • Figure 3: The MHA module of the teacher and student in the distillation stage.
  • Figure 4: Visualization of synthetic images generated by some representative approaches on the Aircraft, Cars196, and CUB200 datasets.
  • Figure 5: Visualization of the t-SNE distribution on the Aircraft dataset.
  • ...and 2 more figures
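
The element-wise multiplication and addition named in the Figure 2 caption can be sketched as a residual spatial gate. This is only an illustration under stated assumptions: the attention map here is the sigmoid of the channel-wise mean, chosen to keep the example dependency-free, whereas the paper's module may compute it differently (for example with convolutions). The function name `spatial_attention` is hypothetical.

```python
import math


def sigmoid(value):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def spatial_attention(feature):
    """Apply a residual spatial gate to a C x H x W feature map.

    feature: list of C channels, each an H x W nested list of floats.
    Assumption: the attention map is the sigmoid of the channel-wise
    mean; the paper's generator may derive it in another way.
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])

    # One attention weight per spatial location, shared by all channels.
    attn = [
        [
            sigmoid(sum(feature[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Element-wise multiplication by the map, then element-wise addition
    # of the input (the residual path).
    out = [
        [
            [feature[c][i][j] * attn[i][j] + feature[c][i][j] for j in range(width)]
            for i in range(height)
        ]
        for c in range(channels)
    ]
    return attn, out


attn, out = spatial_attention([[[0.0, 2.0]], [[0.0, -2.0]]])
print(attn, out)
```

Because the gated features are added back to the input, locations with low attention are attenuated rather than zeroed, which keeps the signal flowing through every generator block.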