
Continual Learning in Open-vocabulary Classification with Complementary Memory Systems

Zhen Zhu, Weijie Lyu, Yao Xiao, Derek Hoiem

TL;DR

This work proposes a method for flexible and efficient continual learning in open-vocabulary image classification, drawing inspiration from the complementary learning systems observed in human cognition. Predictions from a CLIP zero-shot model and an exemplar-based model are combined using the zero-shot estimated probability that a sample's class is within the exemplar classes.

Abstract

We introduce a method for flexible and efficient continual learning in open-vocabulary image classification, drawing inspiration from the complementary learning systems observed in human cognition. Specifically, we propose to combine predictions from a CLIP zero-shot model and an exemplar-based model, using the zero-shot estimated probability that a sample's class is within the exemplar classes. We also propose a "tree probe" method, an adaptation of lazy learning principles, which enables fast learning from new examples with accuracy competitive with batch-trained linear models. We test in data incremental, class incremental, and task incremental settings, and we also test the ability to perform flexible inference on varying subsets of zero-shot and learned categories. Our proposed method achieves a good balance of learning speed, target task effectiveness, and zero-shot effectiveness. Code will be available at https://github.com/jessemelpolio/TreeProbe.
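
The fusion idea in the abstract can be illustrated with a minimal sketch. It is not taken from the released code: the function name `fuse_predictions`, the dictionary-based inputs, and the exact weighting formula are assumptions made for illustration. The sketch treats the zero-shot probability mass that falls on the exemplar classes as the weight given to the exemplar-based model, and gives the remaining weight to the zero-shot model. It also assumes every exemplar class appears among the zero-shot candidate labels.

```python
def fuse_predictions(p_zero_shot, p_exemplar):
    """Fuse two class-probability dictionaries (illustrative sketch only).

    p_zero_shot: label -> probability over all candidate labels (sums to 1).
    p_exemplar:  label -> probability over exemplar classes only (sums to 1).
    Assumption: every exemplar label is also a key of p_zero_shot.
    """
    exemplar_classes = set(p_exemplar)

    # Zero-shot estimate that the sample's class is one of the exemplar classes.
    p_in = sum(p for label, p in p_zero_shot.items() if label in exemplar_classes)

    # Weight the exemplar model by p_in and the zero-shot model by (1 - p_in).
    fused = {}
    for label, p_zs in p_zero_shot.items():
        fused[label] = p_in * p_exemplar.get(label, 0.0) + (1.0 - p_in) * p_zs
    return p_in, fused


# Hypothetical probabilities for a single image.
p_zero_shot = {"cat": 0.5, "dog": 0.25, "zebra": 0.25}
p_exemplar = {"cat": 0.25, "dog": 0.75}

p_in, fused = fuse_predictions(p_zero_shot, p_exemplar)
prediction = max(fused, key=fused.get)
```

Because both inputs sum to one, the fused scores also sum to one under this weighting. A class with no exemplars (here `zebra`) keeps only its down-weighted zero-shot score, so the zero-shot model still decides among classes that have not been learned.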

Paper Structure (34 sections, 5 equations, 11 figures, 9 tables, 2 algorithms)

Figures (11)

  • Figure 1: Our work proposes a method to continually expand and improve classification ability, updating the model quickly with each new labeled example. This is especially helpful in long-tailed classification problems, like the depicted nature classification app. Given a user-provided image, the system predicts the likely classes and immediately updates given any corrections. In this way, the system becomes increasingly capable without any costly offline retraining.
  • Figure 2: Method overview. (a) Our model integrates exemplar-based and consolidated systems. Final predictions are obtained by fusing the predictions from both systems using a weighting method such as AIM. (b) Illustration of TreeProbe, which incrementally adds and hierarchically clusters examples. TreeProbe trains logistic regression classifiers using examples in updated leaf nodes. Colors of exemplars indicate different categories.
  • Figure 3: The upper row illustrates several randomly selected samples in target tasks and zero-shot tasks. In the middle and lower rows, we illustrate how data samples are organized in task, class, and data incremental learning scenarios. The colors of image borders indicate the source of the data. Red class names in data and class incremental learning highlight the differences between the two settings.
  • Figure 4: Demonstration of the three inference scenarios of flexible inference. Images and visual elements are consistent with Fig. 3. Bold text in green shows the final accuracy obtained from each scenario.
  • Figure 5: (a) Results comparing CLIP zero-shot, LinProbe, and CLIP+LinProbe with Avg-Emb and AIM-Emb on target tasks under the data incremental learning scenario. (b) Results of the corresponding models on target tasks under class incremental learning. We plot curves for seen classes, unseen classes, and overall performance, distinguished by different markers. (c) Results of the corresponding models on target tasks under task incremental learning, along with the performance of fine-tuning the whole CLIP network (CLIP Fine-tune) and of ZSCL. (d) Flexible inference results after task incremental learning on all tasks. Note that LinProbe is hard to see in (a) and (b) because its results are similar to those of CLIP+LinProbe (Avg-Emb).
  • ...and 6 more figures