Brain Inspired Adaptive Memory Dual-Net for Few-Shot Image Classification

Kexin Di, Xiuxing Li, Yuyang Han, Ziyu Li, Qing Li, Xia Wu

TL;DR

SCAM-Net tackles supervision collapse in few-shot image classification by emulating a brain-inspired complementary learning system. It introduces a Hippocampus-Neocortex dual-network whose short-term adaptations are gradually consolidated into a long-term memory, with adaptive regulation of memory guided by generalization principles. The method combines dense local-feature processing with global CLS-token regulation, and employs an EMA-based interaction to stabilize learning. Experimental results on four benchmarks demonstrate state-of-the-art performance and improved robustness to data drift, suggesting practical value for real-world few-shot recognition tasks.

Abstract

Few-shot image classification has become a popular research topic because of its wide application in real-world scenarios; however, supervision collapse induced by single image-level annotation remains a major challenge. Existing methods tackle this problem by locating and aligning relevant local features. However, the high intra-class variability of real-world images makes it difficult to locate semantically relevant local regions under few-shot settings. Drawing inspiration from the human complementary learning system, which excels at rapidly capturing and integrating semantic features from limited examples, we propose the generalization-optimized Systems Consolidation Adaptive Memory Dual-Network, SCAM-Net. This approach simulates the systems consolidation of the complementary learning system with an adaptive memory module, which successfully addresses the difficulty of identifying meaningful features in few-shot scenarios. Specifically, we construct a Hippocampus-Neocortex dual-network that consolidates a structured representation of each category; this representation is then stored in a long-term memory inside the Neocortex and adaptively regulated following the generalization optimization principle. Extensive experiments on benchmark datasets show that the proposed model achieves state-of-the-art performance.
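The TL;DR describes short-term adaptations being gradually consolidated into a long-term memory through an EMA-based interaction. A minimal sketch of what such a consolidation step could look like follows. It assumes that the Neocortex weights are an exponential moving average of the Hippocampus weights; the function name `ema_update`, the `decay` parameter, and the toy weight dictionaries are hypothetical and are not taken from the paper.

```python
def ema_update(slow, fast, decay=0.99):
    """Move each slow weight a fraction (1 - decay) toward the matching fast weight.

    Assumption: `slow` stands in for the Neocortex parameters and `fast` for the
    Hippocampus parameters; both are plain dicts of floats in this toy example.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    return {name: decay * slow[name] + (1.0 - decay) * fast[name] for name in slow}


# Toy parameters (hypothetical values, chosen only for illustration).
neocortex = {"w": 0.0, "b": 1.0}
hippocampus = {"w": 1.0, "b": 0.0}

# Three consolidation steps with a deliberately small decay so the drift is visible.
for _ in range(3):
    neocortex = ema_update(neocortex, hippocampus, decay=0.5)
```

With a decay close to 1, the slow network changes only slightly per step, which is the usual reason an EMA is used to stabilize a fast-learning counterpart.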

Paper Structure

This paper contains 13 sections, 8 equations, 4 figures, 6 tables, 1 algorithm.

Figures (4)

  • Figure 1: An image annotated with the label 'sign' but recognized as 'dog' or 'woman', meaning that patterns irrelevant to the test class but seen in the training set are overemphasized. SCAM-Net aims to consolidate only the useful patches and to ignore irrelevant elements as much as possible. Image source: miniImageNet (Vinyals et al., 2016).
  • Figure 2: Overall architecture of the proposed SCAM-Net. ① After the patches of the support set and query set are obtained, a [cls] embedding is concatenated with the projected patches as input to both the Neocortex model and the Hippocampus model. ② Inside the Neocortex model and the Hippocampus model, the similarity matrix between the support and query patch embeddings obtained by the ViTs is computed in parallel; block-diagonal masking is employed to prevent an image from being matched against itself during classification. ③ From this similarity matrix, we obtain the prediction logits of the Neocortex model and the Hippocampus model and calculate the consistency loss $L_{Cons}$. ④ The Neocortex model maintains a long-term memory that is automatically regulated for each task. ⑤ The regulated support CLS is obtained from systems consolidation and correlated with the query CLS, from which we obtain the loss $L_{cls}$. ⑥ Afterwards, $L_{Cons}$, $L_{cls}$, and $L_{CE}$ together update the Hippocampus model. ⑦ The Neocortex model is then updated through consolidation.
  • Figure 3: Impact of adaptive memory regulation (MR) on few-shot image classification performance.
  • Figure 4: Visualization of support-set CLS tokens before and after adaptive memory regulation for four randomly sampled 5-way 1-shot classification tasks. (a), (b), (c), and (d) show the results before memory regulation; (e), (f), (g), and (h) show the corresponding results after memory regulation. Adaptive memory regulation aggregates the representations of images belonging to the same class, so the information in the support set is less biased.
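The Figure 2 caption mentions a patch-level similarity matrix with block-diagonal masking. The sketch below illustrates one way such masking could work. It assumes cosine similarity, a flat list of patch embeddings with a fixed number of patches per image, and `-inf` as the masked value; the names `cosine` and `masked_patch_similarity` are hypothetical, and none of these choices is confirmed by the source.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def masked_patch_similarity(patches, patches_per_image, fill=float("-inf")):
    """Pairwise patch similarities with same-image blocks masked out.

    `patches` is a flat list in which consecutive groups of `patches_per_image`
    entries belong to the same image (an assumed layout).
    """
    n = len(patches)
    if n % patches_per_image:
        raise ValueError("patch count must be a multiple of patches_per_image")
    sim = [[cosine(p, q) for q in patches] for p in patches]
    for i in range(n):
        for j in range(n):
            # Patches i and j come from the same image: mask the block-diagonal entry.
            if i // patches_per_image == j // patches_per_image:
                sim[i][j] = fill
    return sim


# Two toy images with two 2-D patch embeddings each (hypothetical values).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
sim = masked_patch_similarity(patches, patches_per_image=2)
```

Masking with `-inf` means that a subsequent softmax over each row would assign zero weight to patches from the same image, so an image cannot contribute evidence toward its own label.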