Cross-Hierarchical Bidirectional Consistency Learning for Fine-Grained Visual Classification

Pengxiang Gao; Yihao Liang; Yanzhi Song; Zhouwang Yang

TL;DR

This work tackles fine-grained visual classification by exploiting the Tree Hierarchy inherent in the labels, without requiring extra annotations. It introduces CHBC, a framework that combines a trunk net with multiple MGEs (one per hierarchy level) and CAM-based attention, along with a Cross-hierarchical Bidirectional Consistency (CBC) module that enforces consistency between coarse-to-fine and fine-to-coarse predictions. The approach uses matrix orthogonal decomposition to separate level-specific information and a Jensen-Shannon divergence-based loss to align distributions across all hierarchical levels, improving wa_acc and TCR on three FGVC benchmarks. Overall, CHBC demonstrates that bidirectional hierarchical regularization combined with multi-granularity feature enhancement yields more accurate and consistent fine-grained predictions, which matters for applications that need multi-level labels without additional annotations.
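To make the idea of a Jensen-Shannon divergence-based consistency term concrete, here is a minimal sketch in plain Python. It is not the paper's exact loss. It assumes a hypothetical two-level hierarchy described by a `parent_of` list, projects fine-level probabilities onto the coarse level by summing over children, and compares the result with a coarse head's prediction. The names `fine_to_coarse`, `parent_of`, and the toy numbers are illustrative assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def fine_to_coarse(fine_probs, parent_of, num_coarse):
    """Sum the probability of each fine class into its parent class.

    parent_of[i] is the coarse index of fine class i (assumed mapping).
    """
    coarse = [0.0] * num_coarse
    for fine_idx, prob in enumerate(fine_probs):
        coarse[parent_of[fine_idx]] += prob
    return coarse


# Hypothetical toy hierarchy: fine classes 0,1 -> coarse 0; 2,3 -> coarse 1.
parent_of = [0, 0, 1, 1]
fine_probs = [0.6, 0.2, 0.1, 0.1]   # output of a fine-level head (made up)
coarse_probs = [0.7, 0.3]           # output of a coarse-level head (made up)

projected = fine_to_coarse(fine_probs, parent_of, num_coarse=2)
consistency_loss = js_divergence(projected, coarse_probs)
print(projected, consistency_loss)
```

The loss is zero when the two levels agree exactly and grows as they diverge. Because the divergence is symmetric, the same term penalizes disagreement regardless of which level is treated as the reference.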

Abstract

Fine-Grained Visual Classification (FGVC) aims to categorize closely related subclasses, a task complicated by minimal inter-class differences and significant intra-class variance. Existing methods often rely on additional annotations for image classification, overlooking the valuable information embedded in Tree Hierarchies that depict hierarchical label relationships. To leverage this knowledge to improve classification accuracy and consistency, we propose a novel Cross-Hierarchical Bidirectional Consistency Learning (CHBC) framework. The CHBC framework extracts discriminative features across various hierarchies using a specially designed module to decompose and enhance attention masks and features. We employ bidirectional consistency loss to regulate the classification outcomes across different hierarchies, ensuring label prediction consistency and reducing misclassification. Experiments on three widely used FGVC datasets validate the effectiveness of the CHBC framework. Ablation studies further investigate the application strategies of feature enhancement and consistency constraints, underscoring the significant contributions of the proposed modules.
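As an illustration of what label prediction consistency across hierarchies can mean in practice, the sketch below checks whether the labels predicted at each level form a valid root-to-leaf path in a tree. It assumes that a prediction is consistent when every predicted child label has the predicted label one level up as its parent. The three-level tree, the `parent_maps` structure, and the function names are hypothetical and are not taken from the paper.

```python
def is_consistent(pred_path, parent_maps):
    """Return True if the predicted labels form a valid path in the tree.

    pred_path: predicted labels ordered from coarsest to finest level.
    parent_maps[i]: dict mapping a label at level i + 1 to its parent
                    label at level i (assumed representation).
    """
    for level in range(1, len(pred_path)):
        expected_parent = parent_maps[level - 1][pred_path[level]]
        if expected_parent != pred_path[level - 1]:
            return False
    return True


def consistency_rate(pred_paths, parent_maps):
    """Fraction of samples whose multi-level predictions are consistent."""
    if not pred_paths:
        return 0.0
    consistent = sum(is_consistent(p, parent_maps) for p in pred_paths)
    return consistent / len(pred_paths)


# Hypothetical tree with three levels (coarse -> middle -> fine).
parent_maps = [
    {0: 0, 1: 0, 2: 1},        # middle label -> coarse label
    {0: 0, 1: 1, 2: 1, 3: 2},  # fine label   -> middle label
]

# Made-up predictions, one tuple (coarse, middle, fine) per image.
predictions = [
    (0, 0, 0),  # valid path
    (0, 1, 2),  # valid path
    (1, 2, 0),  # fine label 0 belongs to middle label 0, not 2
    (0, 2, 3),  # middle label 2 belongs to coarse label 1, not 0
]

rate = consistency_rate(predictions, parent_maps)
print(rate)
```

A classifier can score well at each level separately and still produce contradictory label paths, which is why a check of this kind is reported alongside accuracy.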
