Online Continuous Generalized Category Discovery

Keon-Hee Park, Hakyung Lee, Kyungwoo Song, Gyeong-Moon Park

TL;DR

The paper addresses the challenge of discovering novel categories in online data streams by introducing Online Continuous Generalized Category Discovery (OCGCD). It proposes DEAN, a framework that combines energy-guided discovery to separate known/unknown data with variance-based feature augmentation (VFA) to enhance pseudo-labeling and an energy-based contrastive loss for discriminative online learning, supported by parameter-efficient tuning (e.g., LoRA). The problem is formalized with a base labeled session $\text{D}^{base}$ and an incremental unlabeled session $\text{D}^{inc}$, where the label sets overlap and unknown categories emerge over time, and results show substantial improvements over state-of-the-art baselines on multiple datasets. The work demonstrates strong clustering performance and reduced forgetting in online settings, indicating practical potential for real-world continual learning and open-set recognition.
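To make the idea of separating known from unknown data by energy more concrete, here is a minimal sketch. It uses the standard free-energy score computed from classifier logits, $E(x) = -T \log \sum_k \exp(z_k / T)$, and a fixed threshold. The function names, the threshold value, and the example logits are all hypothetical; the summary above does not specify how DEAN sets its decision rule, so this should be read as an illustration of energy-based separation in general rather than the paper's procedure.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free energy E(x) = -T * log(sum_k exp(z_k / T)), computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp


def split_by_energy(batch_logits, threshold, temperature=1.0):
    """Return (known_idx, unknown_idx).

    Assumption for illustration: low energy (confident logits) is treated as
    'known' and high energy as 'unknown'. The threshold is a hypothetical
    hyperparameter, not a value taken from the paper.
    """
    known, unknown = [], []
    for i, logits in enumerate(batch_logits):
        if energy_score(logits, temperature) <= threshold:
            known.append(i)
        else:
            unknown.append(i)
    return known, unknown


# Hypothetical logits: one confident sample and one near-uniform sample.
batch = [[9.0, 0.5, 0.2], [0.4, 0.5, 0.3]]
known_idx, unknown_idx = split_by_energy(batch, threshold=-5.0)
```

In this toy batch the confident sample has a much lower energy than the near-uniform one, so the two land on opposite sides of the threshold.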

Abstract

With the advancement of deep neural networks in computer vision, artificial intelligence (AI) is widely employed in real-world applications. However, AI still faces limitations in mimicking high-level human capabilities, such as novel category discovery, for practical use. While some methods utilizing offline continual learning have been proposed for novel category discovery, they neglect the continuity of data streams in real-world settings. In this work, we introduce Online Continuous Generalized Category Discovery (OCGCD), which considers the dynamic nature of data streams where data can be created and deleted in real time. Additionally, we propose a novel method, DEAN, Discovery via Energy guidance and feature AugmentatioN, which can discover novel categories in an online manner through energy-guided discovery and facilitate discriminative learning via an energy-based contrastive loss. Furthermore, DEAN effectively pseudo-labels unlabeled data through variance-based feature augmentation. Experimental results demonstrate that our proposed DEAN achieves outstanding performance in the proposed OCGCD scenario.
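The abstract does not spell out how variance-based feature augmentation works. One plausible reading, sketched below purely as an assumption, is to estimate per-dimension statistics of the features in a group and draw synthetic features from a Gaussian with those statistics, so that sparsely populated groups have more points to cluster. The function name `augment_features` and the toy features are hypothetical and are not taken from the paper.

```python
import random
import statistics


def augment_features(features, num_samples, seed=0):
    """Sample synthetic feature vectors from a per-dimension Gaussian.

    Assumption for illustration: each dimension is modeled independently,
    using the mean and population standard deviation of the given features.
    """
    rng = random.Random(seed)
    dims = list(zip(*features))  # transpose: one tuple of values per dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return [
        [rng.gauss(m, s) for m, s in zip(means, stds)]
        for _ in range(num_samples)
    ]


# Hypothetical 2-D features for one group; the second dimension has zero variance.
group = [[1.0, 2.0], [3.0, 2.0]]
augmented = augment_features(group, num_samples=5)
```

Dimensions with zero variance are reproduced exactly, while dimensions with spread yield new values scattered around the group mean.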

Paper Structure

This paper contains 16 sections, 9 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Panel (a) shows that existing methods perform poorly under online training, suggesting that prior methods cannot handle online continual learning. The scenario panel shows the proposed setting, OCGCD. Because this scenario assumes batch-wise online learning, the model suffers severe forgetting and poor novel category discovery.
  • Figure 2: Overall process of the proposed DEAN framework. The energy-guided discovery splits unlabeled data into known, seen, and unseen data for better novel category discovery, while variance-based feature augmentation enhances the clustering of unseen data. $\mathcal{L}_{ec}$ facilitates better discriminative learning in online continual learning.
  • Figure 3: Validating the effectiveness of energy scores for novel category discovery without prior knowledge about novel categories by comparison with existing methods.