Multi-Level Correlation Network For Few-Shot Image Classification

Yunkai Dang, Min Zhang, Zhengyu Chen, Xinliang Zhang, Zheng Wang, Meijun Sun, Donglin Wang

TL;DR

The paper addresses few-shot image classification (FSIC), starting from the limitations of image-level (global) metric learning under distribution shifts and background distractions. It introduces the multi-level correlation network (MLCN), which combines a self-correlation module, a cross-correlation module, and a pattern-correlation module to capture rich local information and structural patterns shared between base and novel classes under the episodic $N$-way $K$-shot regime. The method jointly optimizes an anchor-based cross-entropy loss with the three correlation losses. Ablation studies confirm significant gains from each component, and the full model achieves state-of-the-art results on four standard FSIC benchmarks. By focusing on foreground and local structure, the approach improves the transferability of the embeddings, which matters in practice for learning new classes rapidly from few examples.
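
To make the episodic $N$-way $K$-shot regime concrete, here is a minimal sketch of how one episode could be sampled from a labeled dataset. The function name `sample_episode`, its parameters, and the toy dataset are illustrative assumptions and are not taken from the MLCN code base.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode (hypothetical helper, not from MLCN).

    labels: list of class labels, one per image (list index = image id).
    Returns (support, query): lists of (image_index, episode_label) pairs.
    """
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    # Keep only classes with enough images for both support and query sets.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query images")

    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw disjoint support and query images for this class.
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query

# Toy dataset (assumption): 10 classes with 20 images each, labels only.
toy_labels = [c for c in range(10) for _ in range(20)]
support, query = sample_episode(toy_labels, n_way=5, k_shot=1, n_query=15,
                                rng=random.Random(0))
print(len(support), len(query))  # 5 support images, 75 query images
```

Class labels are remapped to `0..N-1` inside each episode, so the classifier only has to discriminate among the classes present in that episode.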

Abstract

Few-shot image classification (FSIC) aims to recognize novel classes given few labeled images from base classes. Recent works have achieved promising classification performance, especially metric-learning methods, which usually apply a measure at only the image-feature level. In this paper, we argue that a measure at such a level may not be effective enough to generalize from base to novel classes when using only a few images. Instead, we take a multi-level descriptor of an image into consideration. We propose a multi-level correlation network (MLCN) for FSIC to tackle this problem by effectively capturing local information. Concretely, we present a self-correlation module and a cross-correlation module to learn the semantic correspondence of local information based on learned representations. Moreover, we propose a pattern-correlation module to capture the patterns of fine-grained images and find relevant structural patterns between base classes and novel classes. Extensive experiments and analysis show the effectiveness of our proposed method on four widely used FSIC benchmarks. The code for our approach is available at: https://github.com/Yunkai696/MLCN.
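
The contrast between an image-level measure and a measure over local information can be illustrated with a small sketch. It is a generic cosine-correlation example under stated assumptions, not the MLCN modules themselves: the helper names and the toy feature maps are hypothetical, and each feature map is flattened to a list of local descriptors (positions × channels).

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def average_pool(feat):
    """Collapse local descriptors into one image-level vector."""
    return [sum(channel) / len(feat) for channel in zip(*feat)]

def global_similarity(feat_a, feat_b):
    """Image-level measure: pool first, then compare the pooled vectors."""
    return cosine(average_pool(feat_a), average_pool(feat_b))

def correlation_map(feat_q, feat_s):
    """Local measure: similarity for every (query, support) position pair."""
    return [[cosine(q, s) for s in feat_s] for q in feat_q]

# Toy feature maps (assumption): two positions, two channels each.
query_feat = [[1.0, 0.0], [0.0, 1.0]]
support_feat = [[0.0, 1.0], [1.0, 0.0]]  # same content, positions swapped

print(global_similarity(query_feat, support_feat))
print(correlation_map(query_feat, support_feat))
```

In this toy case the pooled vectors are identical, so the image-level score is (numerically close to) 1 and says nothing about where the match occurs. The correlation map shows that query position 0 matches support position 1 and vice versa, which is the kind of local correspondence an image-level measure discards.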

Paper Structure

This paper contains 13 sections, 13 equations, 3 figures, 7 tables.

Figures (3)

  • Figure 1: 5-way 1-shot and 5-shot performance on different backgrounds (BG) and foregrounds (FG), using ProNet with two backbones, Conv4 and ResNet12, on the miniImagenet dataset.
  • Figure 2: Overview of our multi-level correlation network (MLCN) for few-shot image classification. $\mathbf{F}_{q}$ and $\mathbf{F}_{s}$ are the base representations produced by the backbone feature extractor. MLCN minimizes the self-correlation module loss $\mathcal{L}_{SC}$, the cross-correlation module loss $\mathcal{L}_{CC}$, the pattern-correlation module loss $\mathcal{L}_{PC}$, and the anchor-based classification loss $\mathcal{L}_{CE}$ to improve the transferability of the embedding and capture sufficient local information.
  • Figure 3: GradCAM (Selvaraju et al., 2017) visualization of the self-correlation module loss $\mathcal{L}_{SC}$, the cross-correlation module loss $\mathcal{L}_{CC}$, and the pattern-correlation loss $\mathcal{L}_{PC}$. "Overall" denotes the combined loss used in our MLCN.
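
The joint objective described in the Figure 2 caption can be written schematically as a weighted sum of the four losses. This form is an assumption for illustration: the weights $\lambda_{SC}$, $\lambda_{CC}$, and $\lambda_{PC}$ are hypothetical symbols that do not appear in the text above, and the paper's actual combination may differ.

```latex
% Schematic joint objective (assumed weighted-sum form; the lambdas are
% hypothetical balancing weights, not symbols from the source).
\mathcal{L}
  = \mathcal{L}_{CE}
  + \lambda_{SC}\,\mathcal{L}_{SC}
  + \lambda_{CC}\,\mathcal{L}_{CC}
  + \lambda_{PC}\,\mathcal{L}_{PC}
```

Setting all three weights to 1 reduces this to a plain sum of the anchor-based classification loss and the three correlation losses.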