Novel Class Discovery for Ultra-Fine-Grained Visual Categorization

Yu Liu, Yaqi Cai, Qi Jia, Binglin Qiu, Weimin Wang, Nan Pu

TL;DR

This work introduces Ultra-Fine-Grained Novel Class Discovery (UFG-NCD), a task that uses partially labeled Ultra-FGVC data to discover new categories among unlabeled images. To address it, the authors propose Region-Aligned Proxy Learning (RAPL), which combines Channel-wise Region Alignment (CRA), for extracting discriminative local features, with Semi-Supervised Proxy Learning (SemiPL), which uses class proxies for proxy-guided supervised and contrastive learning. RAPL achieves state-of-the-art results on five SoyAgeing Ultra-FGVC datasets under both task-agnostic and task-aware protocols, indicating that knowledge transfers effectively from labeled to unlabeled ultra-fine-grained classes. Both the regional feature modeling and the proxy-based distribution learning contribute to these gains, with practical implications for scalable Ultra-FGVC in real-world domains such as precision agriculture.

Abstract

Ultra-fine-grained visual categorization (Ultra-FGVC) aims at distinguishing highly similar sub-categories within fine-grained objects, such as different soybean cultivars. Compared to traditional fine-grained visual categorization, Ultra-FGVC encounters more hurdles due to the small inter-class and large intra-class variation. Given these challenges, relying on human annotation for Ultra-FGVC is impractical. To this end, our work introduces a novel task termed Ultra-Fine-Grained Novel Class Discovery (UFG-NCD), which leverages partially annotated data to identify new categories of unlabeled images for Ultra-FGVC. To tackle this problem, we devise a Region-Aligned Proxy Learning (RAPL) framework, which comprises a Channel-wise Region Alignment (CRA) module and a Semi-Supervised Proxy Learning (SemiPL) strategy. The CRA module is designed to extract and utilize discriminative features from local regions, facilitating knowledge transfer from labeled to unlabeled classes. Furthermore, SemiPL strengthens representation learning and knowledge transfer with proxy-guided supervised learning and proxy-guided contrastive learning. Such techniques leverage class distribution information in the embedding space, improving the mining of subtle differences between labeled and unlabeled ultra-fine-grained classes. Extensive experiments demonstrate that RAPL significantly outperforms baselines across various datasets, indicating its effectiveness in handling the challenges of UFG-NCD. Code is available at https://github.com/SSDUT-Caiyq/UFG-NCD.

Paper Structure (17 sections, 12 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Conceptual overview of UFG-NCD, where $D^l$ and $D^u$ are the labeled and unlabeled data splits. We propose to exploit novel class discovery for partially annotated UFG images. The core idea is to learn prior knowledge from labeled images via supervised learning, then extend that knowledge to unlabeled images via a semi-supervised framework. This facilitates knowledge transfer and representation learning across ultra-fine-grained classes.
  • Figure 2: Overview of our Region-Aligned Proxy Learning (RAPL) framework. We first group and align extracted feature maps in the set $Q$ with regions via Channel-wise Region Alignment. Then, we generate global representation $v$ through global average pooling and MLP projection $h$, and lastly learn and transfer knowledge with the proxy bank using semi-supervised proxy learning (SemiPL).
  • Figure 3: Illustration of the proposed SemiPL. We first introduce a supervised paradigm, indicated by blue lines, which includes $\mathcal{L}^{PC}$ and $\mathcal{L}^{REG}$, the losses for proxy-guided classification and proxy regularization, respectively. Then, we propose proxy-guided contrastive learning (PCL) with global distribution guidance from $\mathcal{P}^{old}$ and $\mathcal{P}^{new}$ to learn and discover new classes from unlabeled data, indicated by yellow lines.
  • Figure 4: Impact of hyper-parameters on ACC with respect to "All" instances in the test dataset of SoyAgeing-R1.
  • Figure 5: Visualization of feature distributions of 20 unlabeled classes from SoyAgeing-R5.
  • ...and 1 more figure
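
The Figure 3 caption names a proxy-guided classification loss $\mathcal{L}^{PC}$, but this summary does not give its formula. As a rough illustration of the general idea of classifying a representation $v$ against a bank of class proxies, here is a minimal sketch. It assumes cosine similarity followed by a temperature-scaled softmax cross-entropy, a common proxy-based formulation that may differ from the authors' actual loss. The function names and the temperature value are hypothetical, chosen only for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def proxy_logits(embedding, proxies, temperature=0.1):
    """Cosine similarity between one embedding and every proxy, divided by temperature."""
    e = l2_normalize(embedding)
    return [
        sum(a * b for a, b in zip(e, l2_normalize(p))) / temperature
        for p in proxies
    ]


def proxy_classification_loss(embedding, proxies, label, temperature=0.1):
    """Softmax cross-entropy over proxy similarities (assumed form, not the paper's)."""
    logits = proxy_logits(embedding, proxies, temperature)
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def nearest_proxy(embedding, proxies):
    """Index of the proxy with the highest cosine similarity to the embedding."""
    logits = proxy_logits(embedding, proxies, temperature=1.0)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy proxy bank with two classes in a 2-D embedding space.
proxies = [[1.0, 0.0], [0.0, 1.0]]
v = [0.9, 0.1]  # stands in for the projected global representation
loss_correct = proxy_classification_loss(v, proxies, label=0)
loss_wrong = proxy_classification_loss(v, proxies, label=1)
```

In this toy setup the embedding lies close to the first proxy, so the loss for label 0 is smaller than the loss for label 1, and `nearest_proxy(v, proxies)` selects index 0. In a real model the proxies would be learnable parameters updated together with the backbone.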