Prototypical Hash Encoding for On-the-Fly Fine-Grained Category Discovery

Haiyang Zheng, Nan Pu, Wenjing Li, Nicu Sebe, Zhun Zhong

TL;DR

The paper tackles On-the-fly Category Discovery (OCD) for fine-grained classes by addressing the high sensitivity of the hash-based descriptors used in existing methods. It introduces Prototypical Hash Encoding (PHE), a two-stage framework consisting of Category-aware Prototype Generation (CPG) and Discriminative Category Encoding (DCE). CPG represents each category with multiple prototypes; DCE maps those prototypes to hash centers and enforces intra-class compactness and inter-class separation. Hash centers are constrained by a center-separation loss guided by the Gilbert–Varshamov bound, and a Hamming-ball inference strategy enables real-time online labeling of known and unknown classes. Empirical results on eight fine-grained datasets show substantial improvements over prior work such as SMILE (the abstract reports +5.3% in ALL ACC averaged over all datasets), with more stable performance as hash length increases and interpretable prototype visualizations that show why samples are categorized as known or novel. The approach offers practical benefits for open-world recognition and provides a foundation for integrating prototype-based and hash-based techniques in OCD.
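As a rough illustration of how the Gilbert–Varshamov bound can guide a minimum separation between hash centers, the sketch below computes the largest distance d for which the bound guarantees that a given number of binary codewords of a given length exist with pairwise Hamming distance at least d. The function name `gv_min_distance` and its parameters are hypothetical, and this summary does not state how the paper turns the bound into its loss threshold, so treat this as a sketch of the bound itself rather than the authors' procedure.

```python
from math import comb


def gv_min_distance(code_length: int, num_centers: int) -> int:
    """Largest d such that the Gilbert-Varshamov bound guarantees a binary
    code with `num_centers` codewords of length `code_length` and pairwise
    Hamming distance >= d. Returns 0 if even d = 1 is not guaranteed.

    Bound used: A_2(n, d) >= 2^n / sum_{j=0}^{d-1} C(n, j).
    """
    best = 0
    for d in range(1, code_length + 1):
        # Number of binary words within Hamming radius d - 1 of a codeword.
        ball_volume = sum(comb(code_length, j) for j in range(d))
        if 2 ** code_length >= num_centers * ball_volume:
            best = d
        else:
            # The ball volume only grows with d, so larger d fails too.
            break
    return best


if __name__ == "__main__":
    # Hypothetical setting: 100 known classes, 12-bit and 16-bit codes.
    print(gv_min_distance(12, 100))  # 2
    print(gv_min_distance(16, 100))  # 3
```

The bound is a guarantee of existence, not a tight limit: better-separated codes may exist, which is one reason longer hash codes leave more room to keep centers apart.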

Abstract

In this paper, we study a practical yet challenging task, On-the-fly Category Discovery (OCD), which aims to discover online the categories of newly arriving stream data that belong to both known and unknown classes, leveraging only the known-category knowledge contained in labeled data. Previous OCD methods employ hash-based techniques to represent old/new categories by hash codes for instance-wise inference. However, directly mapping features into a low-dimensional hash space not only inevitably damages the ability to distinguish classes but also causes a "high sensitivity" issue, especially for fine-grained classes, leading to inferior performance. To address these issues, we propose a novel Prototypical Hash Encoding (PHE) framework consisting of Category-aware Prototype Generation (CPG) and Discriminative Category Encoding (DCE), which mitigates the sensitivity of hash codes while preserving the rich discriminative information contained in the high-dimensional feature space, in a two-stage projection fashion. CPG enables the model to fully capture intra-category diversity by representing each category with multiple prototypes. DCE boosts the discrimination ability of hash codes with the guidance of the generated category prototypes and the constraint of a minimum separation distance. By jointly optimizing CPG and DCE, we demonstrate that these two components are mutually beneficial towards effective OCD. Extensive experiments show the significant superiority of our PHE over previous methods, e.g., obtaining an improvement of +5.3% in ALL ACC averaged on all datasets. Moreover, due to the nature of the interpretable prototypes, we visually analyze the underlying mechanism of how PHE helps group certain samples into either known or unknown categories. Code is available at https://github.com/HaiyangZheng/PHE.
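The instance-wise, on-the-fly inference described above can be sketched as follows: each incoming hash code is compared with the current hash centers by Hamming distance; if it falls inside the ball around its nearest center it takes that label, otherwise it founds a new category. This is a minimal sketch, not the released implementation: the class name `HammingBallLabeler`, the `radius` parameter, and the rule of using an unmatched code as the new center are all assumptions made for illustration.

```python
import numpy as np


def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of positions where two equal-length codes differ."""
    return int(np.count_nonzero(a != b))


class HammingBallLabeler:
    """Online labeling with Hamming balls around hash centers (assumed rule)."""

    def __init__(self, known_centers, radius: int):
        # Centers of known classes; entries are +1 / -1 (assumed encoding).
        self.centers = [np.asarray(c) for c in known_centers]
        self.num_known = len(self.centers)
        self.radius = radius  # hypothetical ball radius

    def label(self, code) -> int:
        code = np.asarray(code)
        dists = [hamming(code, c) for c in self.centers]
        nearest = int(np.argmin(dists))
        if dists[nearest] <= self.radius:
            # Inside an existing ball: known class or earlier discovery.
            return nearest
        # Outside every ball: register the code as a new category center.
        self.centers.append(code.copy())
        return len(self.centers) - 1


if __name__ == "__main__":
    labeler = HammingBallLabeler([[1] * 8, [-1] * 8], radius=1)
    print(labeler.label([1, 1, 1, 1, 1, 1, 1, -1]))      # 0 (known class)
    print(labeler.label([1, 1, 1, 1, -1, -1, -1, -1]))   # 2 (new category)
    print(labeler.label([1, 1, 1, -1, -1, -1, -1, -1]))  # 2 (joins it)
```

Labels below `num_known` denote known classes; higher labels are categories discovered during the stream. Because every decision uses only the current code and the stored centers, each sample receives its label immediately.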

Paper Structure

This paper contains 28 sections, 10 equations, 11 figures, 15 tables, 2 algorithms.

Figures (11)

  • Figure 1: (a) Schema of the Offline Category Discovery task (e.g., NCD and GCD). (b) Schema of the On-the-fly Category Discovery task studied in this paper. (c) Previous work (e.g., SMILE) based on instance-level hash encoding. (d) Our PHE explores prototype-based hash encoding. (e) Performance comparison of PHE and SMILE, and an observation about "High Sensitivity".
  • Figure 2: Our PHE framework is composed of the CPG and DCE modules. First, CPG generates category-specific prototypes and prototype-guided instance representations. Then, DCE encodes the generated prototypes as hash centers to encourage the model to learn discriminative instance hash codes. Finally, based on the Hamming distance between instance hash codes and hash centers, we obtain instant feedback and group instances online into both known and unknown categories.
  • Figure 3: Case Study: Why is a Grasshopper Sparrow classified as a new category?
  • Figure 4: Impact of hyper-parameters.
  • Figure 5: Evolution of hash centers distribution during the training process.
  • ...and 6 more figures