Data Distribution Distilled Generative Model for Generalized Zero-Shot Recognition
Yijie Wang, Mingjian Hong, Luwen Huangfu, Sheng Huang
TL;DR
This work tackles the bias toward seen data in generalized zero-shot learning (GZSL) by recasting GZSL as an end-to-end problem that jointly models in-distribution and out-of-distribution (OOD) data. It introduces the D$^3$GZSL framework, which comprises Feature Generation (FG), In-Distribution Dual-Space Distillation (ID$^2$SD), and Out-of-Distribution Batch Distillation (O$^2$DBD) and optimizes a combined objective that includes $\mathcal{L}_{gen}$, $\mathcal{L}_{id}$, and $\mathcal{L}_{od}$. ID$^2$SD aligns teacher and student distributions in the embedding and label spaces, while O$^2$DBD learns a low-dimensional OOD representation for each sample in a batch and models cross-sample correlations to capture structure shared between seen and unseen classes. Empirical results on four GZSL benchmarks show consistent improvements over strong generative baselines, and the approach remains compatible with GAN-, VAE-, and diffusion-based generators, highlighting its practical value for robust zero-shot recognition.
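One plausible way to read the combined objective mentioned above is as a weighted sum of the three terms. The trade-off weights $\lambda_{id}$ and $\lambda_{od}$ are assumptions introduced here for illustration; this summary does not state how the terms are actually balanced.

$$
\mathcal{L}_{total} = \mathcal{L}_{gen} + \lambda_{id}\,\mathcal{L}_{id} + \lambda_{od}\,\mathcal{L}_{od}, \qquad \lambda_{id},\ \lambda_{od} \ge 0
$$

Under this reading, setting $\lambda_{id} = \lambda_{od} = 0$ would recover the underlying generative baseline trained with $\mathcal{L}_{gen}$ alone.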
Abstract
Generalized Zero-Shot Learning (GZSL) models tend to be biased toward seen data. To counter this bias, we introduce an end-to-end generative GZSL framework called D$^3$GZSL. The framework treats seen data as in-distribution and synthesized unseen data as out-of-distribution, which yields a more balanced model. D$^3$GZSL comprises two core modules: in-distribution dual-space distillation (ID$^2$SD) and out-of-distribution batch distillation (O$^2$DBD). ID$^2$SD aligns teacher-student outcomes in the embedding and label spaces, enhancing learning coherence. O$^2$DBD introduces a low-dimensional out-of-distribution representation for each sample in a batch, capturing structures shared between seen and unseen categories. Our approach is effective across established GZSL benchmarks and integrates seamlessly into mainstream generative frameworks. Extensive experiments consistently show that D$^3$GZSL improves the performance of existing generative GZSL methods, underscoring its potential to refine zero-shot learning practices. The code is available at: https://github.com/PJBQ/D3GZSL.git
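To make the idea of aligning teacher-student outcomes in both the embedding and label spaces more concrete, here is a minimal sketch of a generic dual-space distillation loss. It is not the authors' implementation: the choice of mean squared error for the embedding space, a temperature-softened KL divergence for the label space, and the names `embedding_space_loss`, `label_space_loss`, `dual_space_loss`, `alpha`, `beta`, and `temperature` are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def label_space_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened class distributions.

    Assumption: KL divergence is a common distillation choice, not
    necessarily the one used by D^3GZSL.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Clamp q to avoid log(0) if a student probability underflows.
    return sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)


def embedding_space_loss(teacher_emb, student_emb):
    """Mean squared error between teacher and student embeddings (assumed form)."""
    return sum((t - s) ** 2 for t, s in zip(teacher_emb, student_emb)) / len(teacher_emb)


def dual_space_loss(t_emb, s_emb, t_logits, s_logits, alpha=1.0, beta=1.0):
    """Weighted sum of the embedding-space and label-space terms.

    alpha and beta are hypothetical trade-off weights.
    """
    return (alpha * embedding_space_loss(t_emb, s_emb)
            + beta * label_space_loss(t_logits, s_logits))


if __name__ == "__main__":
    # Toy teacher and student outputs for a single sample.
    loss = dual_space_loss([0.2, 0.8], [0.1, 0.9], [2.0, 0.5, 0.1], [1.5, 0.7, 0.3])
    print(f"dual-space distillation loss: {loss:.4f}")
```

Both terms vanish when the student reproduces the teacher exactly, so the loss only penalizes disagreement between the two networks. In a real training loop this quantity would be computed per batch with a tensor library and added to the generator's objective.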
