Improved Feature Generating Framework for Transductive Zero-shot Learning
Zihan Ye, Xinyuan Ru, Shiming Chen, Yaochu Jin, Kaizhu Huang, Xiaobo Jin
TL;DR
This work tackles transductive zero-shot learning (TZSL) by identifying the unconditional unseen discriminator as a key source of prior-bias-induced degradation. It introduces I-VAEGAN, which pairs Pseudo-conditional Feature Adversarial (PFA) learning with Variational Embedding Regression (VER) to mitigate prior bias and improve semantic regression, respectively. Through a three-stage training regime and extensive experiments on AWA1/2, CUB, and SUN, I-VAEGAN achieves state-of-the-art TZSL and TGZSL performance across diverse unseen-class priors, while reducing Accumulated Prior Error. The proposed approach offers robust performance under unknown priors and demonstrates compatibility with existing TZSL frameworks, signaling practical impact for zero-shot recognition tasks where unseen priors are uncertain.
Abstract
Feature Generative Adversarial Networks have emerged as powerful generative models for producing high-quality representations of unseen classes in Zero-shot Learning (ZSL). This paper examines the pivotal influence of unseen class priors in transductive ZSL (TZSL) and shows that even a marginal prior bias can cause substantial accuracy declines. Our extensive analysis reveals that this weakness fundamentally stems from the use of an unconditional unseen discriminator, a core component of existing TZSL methods. We further establish that the detrimental effects of this component are inevitable unless the generator perfectly fits class-specific distributions. Building on these insights, we introduce our Improved Feature Generation Framework, termed I-VAEGAN, which incorporates two novel components: Pseudo-conditional Feature Adversarial (PFA) learning and Variational Embedding Regression (VER). PFA circumvents the need for prior estimation by explicitly injecting predicted semantics as pseudo-conditions for unseen classes, an approach that relies on precise semantic regression. Meanwhile, VER uses reconstructive pre-training to learn class statistics and thereby obtains better semantic regression. Our I-VAEGAN achieves state-of-the-art TZSL accuracy across various benchmarks and priors. Our code will be released upon acceptance.
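
As a rough illustration of why the unseen class prior matters to an unconditional discriminator, the toy sketch below compares the marginal feature mean under a true prior with the mean under a biased prior. It is not the paper's analysis: the two classes, their means, both priors, and the helper `mixture_mean` are hypothetical values chosen for illustration. It only relies on the fact that a marginal distribution is the prior-weighted mixture of the class-conditional distributions, so a discriminator that sees features without class conditions compares these mixtures rather than the individual classes.

```python
import numpy as np


def mixture_mean(priors, class_means):
    """Mean of a mixture: the prior-weighted sum of the class means."""
    priors = np.asarray(priors, dtype=float)
    class_means = np.asarray(class_means, dtype=float)
    return priors @ class_means


# Two hypothetical unseen classes with 2-D feature means (assumed values).
class_means = np.array([[0.0, 0.0],
                        [4.0, 0.0]])

true_prior = np.array([0.8, 0.2])      # assumed real class frequencies
assumed_prior = np.array([0.5, 0.5])   # assumed (biased) sampling prior

# Marginal mean of the real unlabeled unseen features.
real_marginal_mean = mixture_mean(true_prior, class_means)

# Marginal mean of generated features when each class is modeled
# exactly but classes are sampled with the biased prior.
fake_marginal_mean = mixture_mean(assumed_prior, class_means)

# Distance between the two marginal means.
gap = np.linalg.norm(real_marginal_mean - fake_marginal_mean)
print(real_marginal_mean, fake_marginal_mean, gap)
```

In this toy setting the marginals differ even though each class mean is matched exactly, so a critic that only sees unconditioned features would still register a mismatch. Conditioning the critic on per-sample semantics, as PFA is described as doing with predicted semantics, removes the dependence on the sampling prior in this simplified picture.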
