High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning

Yu Lei, Guoshuai Sheng, Fangfang Li, Quanxue Gao, Cheng Deng, Qin Li

TL;DR

HDAFL (High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning) optimizes visual features by learning attribute features to obtain discriminative visual embeddings, and introduces a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.

Abstract

Zero-shot learning (ZSL) aims to recognize new classes without prior exposure to their samples, relying on semantic knowledge from observed classes. However, current attention-based models may overlook the transferability of visual features and the distinctiveness of attribute localization when learning regional features in images. They also often neglect attributes shared among different objects. Highly discriminative attribute features are crucial for identifying and distinguishing unseen classes. To address these issues, we propose High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL). HDAFL optimizes visual features by learning attribute features to obtain discriminative visual embeddings. Specifically, HDAFL uses multiple convolutional kernels to automatically learn discriminative image regions that are highly correlated with attributes, eliminating irrelevant interference in image features. Furthermore, we introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes. In addition, the method employs a contrastive loss to alleviate dataset biases and enhance the transferability of visual features, facilitating better semantic transfer between seen and unseen classes. Experimental results demonstrate the effectiveness of HDAFL across three widely used datasets (CUB, SUN, and AWA2).

Paper Structure

This paper contains 16 sections, 12 equations, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Schematic diagram of the attribute embedding space. Previous approaches fail to effectively distinguish between different objects with similar attributes in the embedding space (left circle), leading to domain shift issues. To address this problem, we propose a method that enhances the embedding space's ability to capture both commonalities and differences among objects (right rectangle), thereby improving discriminability.
  • Figure 2: A depiction of HDAFL, which optimizes visual features by learning attribute features to obtain discriminative visual embeddings.
  • Figure 3: Evaluation Results for CUB with Different Values of $\mu$ and $\varepsilon$.
  • Figure 4: Analysis of the correlation between $\sigma$ values and the results ($\%$) in ZSL (ACC) and GZSL (H) across CUB, SUN, and AWA2 datasets.
  • Figure 5: Visualization of attribute features using t-SNE on the CUB and SUN datasets.
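
The Figure 4 caption reports GZSL results as H. In the GZSL literature, H conventionally denotes the harmonic mean of the top-1 accuracy on seen classes and on unseen classes; the sketch below assumes the paper follows that convention. The function name and the accuracy values are hypothetical and are not taken from the paper.

```python
def harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean H of seen-class and unseen-class accuracy (in percent).

    Assumes the usual GZSL convention H = 2 * S * U / (S + U).
    """
    total = seen_acc + unseen_acc
    if total == 0:
        # Both accuracies are zero; define H as zero to avoid dividing by zero.
        return 0.0
    return 2.0 * seen_acc * unseen_acc / total


# Hypothetical accuracies, chosen only to illustrate the metric's behavior.
balanced = harmonic_mean(60.0, 60.0)  # equal seen and unseen accuracy
skewed = harmonic_mean(90.0, 30.0)    # same arithmetic mean, biased toward seen classes
print(balanced, skewed)
```

Both hypothetical pairs have the same arithmetic mean, yet the skewed pair scores lower. H therefore penalizes a model that does well on seen classes at the expense of unseen ones.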