Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery

Yuanpeng Tu, Zhun Zhong, Yuxi Li, Hengshuang Zhao

TL;DR

This work tackles generalized category discovery by recognizing that unlabeled data can include unseen classes and that historical prediction consistency contains valuable supervisory signals. It introduces Memory Consistency guided Divide-and-Conquer Learning (MCDL), which uses two online memory banks to capture predictions from weakly and strongly augmented views and derives sample credibility from intra- and inter-memory agreement. Based on credibility, MCDL assigns unlabeled samples to high/medium/low groups and applies supervised, semi-supervised, and self-supervised losses accordingly, improving pseudo-label quality and reducing noise. Empirical results on six datasets show substantial gains over state-of-the-art methods, with notable improvements on unseen categories and semantic-shift benchmarks, and demonstrate that MCDL can serve as a general plug-in to boost existing GCD frameworks.

Abstract

Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning, where only part of the category labels are assigned to certain training samples. Previous methods generally employ naive contrastive learning or unsupervised clustering schemes for all the samples. Nevertheless, they usually ignore the inherent critical information within the historical predictions of the model being trained. Specifically, we empirically reveal that a significant number of salient unlabeled samples yield consistent historical predictions corresponding to their ground-truth category. From this observation, we propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL). In this framework, we introduce two memory banks to record the historical predictions of unlabeled data, which are exploited to measure the credibility of each sample in terms of its prediction consistency. With the guidance of credibility, we design a divide-and-conquer learning strategy to fully utilize the discriminative information of unlabeled data while alleviating the negative influence of noisy labels. Extensive experimental results on multiple benchmarks demonstrate the generality and superiority of our method, which outperforms state-of-the-art models by a large margin on both seen and unseen classes in the generic image recognition and challenging semantic shift settings (e.g., with a +8.4% gain on CUB and +8.1% on Stanford Cars).
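The credibility idea summarized above can be illustrated with a small sketch. It is not the authors' implementation: the class `PredictionMemory`, the function `credibility_group`, the agreement threshold `tau`, and the exact rule separating the medium group from the others are all assumptions made for illustration. Only the overall structure comes from the summary: two memory banks (weak and strong views), intra- and inter-memory agreement, and three credibility groups mapped to supervised, semi-supervised, and self-supervised losses.

```python
from collections import Counter, deque

# Loss family per credibility group, as stated in the TL;DR.
LOSS_BY_GROUP = {
    "high": "supervised",
    "medium": "semi-supervised",
    "low": "self-supervised",
}


class PredictionMemory:
    """Fixed-length history of predicted class ids per sample (hypothetical helper)."""

    def __init__(self, length):
        self.length = length
        self.bank = {}

    def update(self, sample_id, pred):
        # deque(maxlen=...) silently drops the oldest prediction when full.
        self.bank.setdefault(sample_id, deque(maxlen=self.length)).append(pred)

    def majority(self, sample_id):
        """Return (most frequent label, fraction of the history that agrees with it)."""
        history = self.bank.get(sample_id)
        if not history:
            return None, 0.0
        label, count = Counter(history).most_common(1)[0]
        return label, count / len(history)


def credibility_group(weak_mem, strong_mem, sample_id, tau=0.8):
    """Assign a sample to a credibility group (assumed rule, not the paper's).

    intra: each memory bank is self-consistent (agreement >= tau).
    inter: the two banks agree on the majority label.
    """
    w_label, w_agree = weak_mem.majority(sample_id)
    s_label, s_agree = strong_mem.majority(sample_id)
    if w_label is None or s_label is None:
        return "low", None
    intra = w_agree >= tau and s_agree >= tau
    inter = w_label == s_label
    if intra and inter:
        return "high", w_label      # pseudo-label treated as reliable
    if intra or inter:
        return "medium", w_label    # pseudo-label used with caution
    return "low", None              # no pseudo-label


if __name__ == "__main__":
    weak, strong = PredictionMemory(4), PredictionMemory(4)
    for epoch_pred in [3, 3, 3, 3]:
        weak.update("img_0", epoch_pred)
        strong.update("img_0", epoch_pred)
    group, label = credibility_group(weak, strong, "img_0")
    print(group, label, LOSS_BY_GROUP[group])
```

In a training loop, both banks would be updated once per epoch with the model's current predictions for each unlabeled sample, and the resulting group would decide which loss term that sample contributes to.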

Paper Structure (21 sections, 9 equations, 6 figures, 7 tables, 1 algorithm)


Figures (6)

  • Figure 1: Comparison between previous methods and our proposed MCDL. MCDL is trained in a divide-and-conquer manner based on the credibility learnt from historical predictions.
  • Figure 2: Variance distribution of the predictions at the 20th epoch and of the averaged historical predictions over epochs 10-20 on the CUB (a) and CIFAR-100 (b) datasets. Label accuracy of the uncertainty-based, consistency-based, and our history-based (MCDL) methods for the selected samples on the CUB (c) and CIFAR-100 (d) datasets. The consistency-based method selects samples whose predictions agree between the weakly augmented and strongly augmented views, while the uncertainty-based method selects the 10 samples with the smallest uncertainty in each category.
  • Figure 3: An overview of the proposed MCDL method. (a) DCM: Dual-consistency Credibility Modeling (Sec. 3.2). (b) DCL: Divide-and-Conquer Learning (Sec. 3.3). Samples are first adaptively assigned a credibility based on their historical predictions and are then handled with three different learning strategies, according to their credibility levels, in a divide-and-conquer manner.
  • Figure 4: The proposed divide-and-conquer learning strategy, where samples with three types of credibility are tackled with different schemes respectively.
  • Figure 5: Investigation of $\lambda$ and memory bank length $\mu$ on both generic image recognition (i.e., CIFAR-10/100) and semantic shift benchmarks (i.e., Stanford Cars, CUB). Similar performance is achieved across different values of both parameters.
  • ...and 1 more figure
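
The two baseline selection rules described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the paper: the function names are hypothetical, and Shannon entropy of the predicted class distribution is assumed as the uncertainty measure, since the caption does not say which one is used.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (assumed uncertainty measure)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def argmax(probs):
    """Index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


def select_consistent(weak_probs, strong_probs):
    """Consistency-based rule: keep samples whose predicted class is the same
    for the weakly augmented and strongly augmented views."""
    return [
        i
        for i, (w, s) in enumerate(zip(weak_probs, strong_probs))
        if argmax(w) == argmax(s)
    ]


def select_low_uncertainty(probs, k=10):
    """Uncertainty-based rule: for each predicted class, keep the k samples
    with the smallest uncertainty (k=10 in the caption)."""
    by_class = {}
    for i, p in enumerate(probs):
        by_class.setdefault(argmax(p), []).append((entropy(p), i))
    return {c: [i for _, i in sorted(items)[:k]] for c, items in by_class.items()}


if __name__ == "__main__":
    weak = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
    strong = [[0.7, 0.3], [0.6, 0.4], [0.1, 0.9]]
    print(select_consistent(weak, strong))
    print(select_low_uncertainty(weak, k=1))
```

Both rules look only at the current predictions, which is the contrast the caption draws with the history-based selection used by MCDL.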