Local Foreground Selection aware Attentive Feature Reconstruction for few-shot fine-grained plant species classification

Aisha Zulfiqar, Ebroul Izquierdo

TL;DR

The paper tackles fine-grained plant species classification in a few-shot regime by introducing Local Foreground Selection (LFS) attention, a dual-action module that combines Local Attention with Foreground Selection Attention within a vision transformer encoder. LFS creates discriminative support and query feature maps by highlighting foreground regions while capturing local spatial details, thus reducing intra-class variation and increasing inter-class separation. When paired with existing feature reconstruction networks (FRN/BiFRN), this approach achieves state-of-the-art performance on Oxford Flower-102, iNaturalist 2019 plant subset, and PlantNet 300-K across $C$-way $K$-shot tasks on Conv-4 and ResNet-12 backbones. Ablation studies show the effectiveness of combining local and foreground attentions and reveal practical considerations for FS-ratio selection.

Abstract

Plant species exhibit significant intra-class variation and minimal inter-class variation. To enhance classification accuracy, it is essential to reduce intra-class variation while maximizing inter-class variation. This paper addresses plant species classification using a limited number of labelled samples and introduces a novel Local Foreground Selection (LFS) attention mechanism. LFS is a straightforward module designed to generate discriminative support and query feature maps. It operates by integrating two types of attention: local attention, which captures local spatial details to enhance feature discrimination and increase inter-class differentiation, and foreground selection attention, which emphasizes the foreground plant object while mitigating background interference. By focusing on the foreground, the query and support features selectively highlight relevant feature sequences and disregard less significant background sequences, thereby reducing intra-class differences. Experimental results from three plant species datasets demonstrate the effectiveness of the proposed LFS attention mechanism and its complementary advantages over previous feature reconstruction methods.
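
The two attention types described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `local_attention` and `foreground_select`, the 1-D token window, the externally supplied per-token `scores`, and the top-fraction reading of the FS-ratio are all assumptions made for illustration only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def local_attention(tokens, window=1):
    """Hypothetical local attention: each token attends only to tokens
    within `window` positions of itself (1-D sequence for simplicity)."""
    n, dim = len(tokens), len(tokens[0])
    scale = math.sqrt(dim)
    out = []
    for i in range(n):
        neighbours = range(max(0, i - window), min(n, i + window + 1))
        # Scaled dot-product similarity between token i and each neighbour.
        sims = [sum(a * b for a, b in zip(tokens[i], tokens[j])) / scale
                for j in neighbours]
        weights = softmax(sims)
        # Weighted average of the neighbouring token features.
        out.append([sum(w * tokens[j][d] for w, j in zip(weights, neighbours))
                    for d in range(dim)])
    return out

def foreground_select(tokens, scores, fs_ratio=0.5):
    """Hypothetical foreground selection: keep the top `fs_ratio` fraction
    of tokens by score (assumed to reflect foreground relevance),
    preserving their original order."""
    keep = max(1, int(fs_ratio * len(tokens)))
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    kept = sorted(top)
    return [tokens[i] for i in kept], kept

# Toy example: four 2-D tokens with made-up foreground scores.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.1, 0.1]]
scores = [0.1, 0.7, 0.9, 0.05]
foreground, kept_indices = foreground_select(local_attention(tokens), scores, 0.5)
```

In this sketch a smaller `fs_ratio` discards more tokens as background, which is one way to read the practical trade-off around FS-ratio selection mentioned in the TL;DR.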

Paper Structure

This paper contains 20 sections, 5 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Local Foreground Selection aware Attentive Feature Reconstruction Network.
  • Figure 2: Local Foreground Selection Module where features are fed into the vision transformer and outputs feature pool.
  • Figure 3: Heat map comparing the original feature maps to our proposed Local Foreground Selection attention on Oxford Flower-102 dataset.