ClearGCD: Mitigating Shortcut Learning For Robust Generalized Category Discovery
Kailin Lyu, Jianwei He, Long Xiao, Jianing Zeng, Liang Fan, Lin Shu, Jie Hao
TL;DR
Addresses poor generalization and forgetting of known classes in open-world Generalized Category Discovery (GCD) by pinpointing shortcut learning as a key bottleneck. Introduces ClearGCD, a plug-and-play framework that combines Semantic View Alignment (SVA) and Shortcut Suppression Regularization (SSR) to suppress non-semantic cues and stabilize representations. Builds on a parametric clustering setup, adding cross-augmentation consistency and adaptive prototype alignment to improve recognition of both known and novel categories. Demonstrates consistent improvements over state-of-the-art baselines on CIFAR, ImageNet-100, and fine-grained datasets, supporting its practical value for open-world vision tasks.
Abstract
In open-world scenarios, Generalized Category Discovery (GCD) requires identifying both known and novel categories within unlabeled data. However, existing methods often suffer from prototype confusion caused by shortcut learning, which undermines generalization and leads to forgetting of known classes. We propose ClearGCD, a framework designed to mitigate reliance on non-semantic cues through two complementary mechanisms. First, Semantic View Alignment (SVA) generates strong augmentations via cross-class patch replacement and enforces semantic consistency using weak augmentations. Second, Shortcut Suppression Regularization (SSR) maintains an adaptive prototype bank that aligns known classes while encouraging separation of potential novel ones. ClearGCD can be seamlessly integrated into parametric GCD approaches and consistently outperforms state-of-the-art methods across multiple benchmarks.
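To make the idea of cross-class patch replacement concrete, the following is a minimal sketch and not the authors' implementation. It assumes images are 2D grids of pixel values, that the donor image comes from a different class, and that patches are swapped at the same grid position; the function name `cross_class_patch_replace` and the parameters `patch_size` and `replace_ratio` are hypothetical choices made for illustration.

```python
import random


def cross_class_patch_replace(image, donor, patch_size=8, replace_ratio=0.25, rng=None):
    """Return a copy of `image` with a random subset of patches taken from `donor`.

    Assumption (not from the paper): `image` and `donor` are equally sized
    2D lists, `donor` belongs to a different class, and each replaced patch
    is copied from the same grid position in `donor`.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if (len(donor), len(donor[0])) != (height, width):
        raise ValueError("image and donor must have the same shape")
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")

    cols = width // patch_size
    n_patches = (height // patch_size) * cols
    n_replace = round(replace_ratio * n_patches)

    # Pick distinct patch indices to overwrite.
    chosen = rng.sample(range(n_patches), n_replace)

    out = [row[:] for row in image]  # leave the input untouched
    for idx in chosen:
        r, c = divmod(idx, cols)
        y0, x0 = r * patch_size, c * patch_size
        for y in range(y0, y0 + patch_size):
            out[y][x0:x0 + patch_size] = donor[y][x0:x0 + patch_size]
    return out


if __name__ == "__main__":
    image = [[0] * 32 for _ in range(32)]   # stand-in for a sample of one class
    donor = [[1] * 32 for _ in range(32)]   # stand-in for a sample of another class
    strong_view = cross_class_patch_replace(image, donor, rng=random.Random(0))
    replaced = sum(map(sum, strong_view))
    print(f"{replaced} of {32 * 32} pixels now come from the donor image")
```

In a pipeline of the kind the abstract describes, the output would serve as the strong view, and a consistency objective would tie its prediction to that of a weakly augmented view of the original image. How patches are selected and how the consistency loss is defined are details the abstract does not specify.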
