
Fine-Grained Classification for Poisonous Fungi Identification with Transfer Learning

Christopher Chiu, Maximilian Heil, Teresa Kim, Anthony Miyaguchi

TL;DR

This work treats poisonous fungi identification as a fine-grained visual categorization (FGVC) task, using precomputed embeddings from self-supervised vision models (notably DINOv2), ensemble classifier heads, and metadata integration. The methodology includes merging the DF20 and DF21 datasets, metadata feature engineering (cyclical date encoding, Geohash location), and a two-fold cross-validation ensemble trained on embeddings, with a composite loss combining seesaw and poison-detection terms. Key findings are that DINOv2 embeddings outperform ResNet baselines, that metadata contributes incremental gains, and that the best post-competition results reach 78.4% accuracy and 0.577 macro-F1 on the private test set, with a Track 3 score of 0.345. The work demonstrates the viability and efficiency of embedding-based transfer learning for real-world FGVC tasks, and points to domain-specific fine-tuning and more rigorous metadata integration as ways to close the remaining gap with end-to-end methods.
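To make the cyclical date encoding mentioned above concrete, here is a minimal sketch of the general technique: a periodic value such as the month is mapped to a point on the unit circle, so that December and January end up close together. The function names, the choice of month and day-of-year features, and the 365.25-day period are illustrative assumptions, not details taken from the paper.

```python
import math
from datetime import date
from typing import Dict, Tuple


def encode_cyclical(value: float, period: float) -> Tuple[float, float]:
    """Map a periodic value in [0, period) to (sin, cos) on the unit circle."""
    angle = 2.0 * math.pi * (value / period)
    return math.sin(angle), math.cos(angle)


def encode_date(d: date) -> Dict[str, float]:
    """Hypothetical helper: encode month and day-of-year as cyclical features."""
    # Shift to zero-based values so January and day 1 map to angle 0.
    month_sin, month_cos = encode_cyclical(d.month - 1, 12)
    # 365.25 is an assumed average year length that absorbs leap years.
    day_sin, day_cos = encode_cyclical(d.timetuple().tm_yday - 1, 365.25)
    return {
        "month_sin": month_sin,
        "month_cos": month_cos,
        "day_sin": day_sin,
        "day_cos": day_cos,
    }


if __name__ == "__main__":
    # An arbitrary example date, used only to show the output shape.
    print(encode_date(date(2024, 10, 15)))
```

With this encoding the distance between December and January equals the distance between January and February, which a raw month number (12 versus 1) would not capture.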

Abstract

FungiCLEF 2024 addresses the fine-grained visual categorization (FGVC) of fungi species, with a focus on identifying poisonous species. This task is challenging due to the size and class imbalance of the dataset, subtle inter-class variations, and significant intra-class variability among samples. In this paper, we document our approach to this challenge, which uses ensemble classifier heads on pre-computed image embeddings. Our team (DS@GT) demonstrates that state-of-the-art self-supervised vision models can serve as robust feature extractors for downstream computer vision tasks without task-specific fine-tuning of the vision backbone. Our approach achieved the best Track 3 score (0.345), accuracy (78.4%), and macro-F1 (0.577) on the private test set in post-competition evaluation. Our code is available at https://github.com/dsgt-kaggle-clef/fungiclef-2024.
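As a rough illustration of what an ensemble of classifier heads can look like at inference time, the sketch below averages the softmax probabilities produced by several heads for one image. Averaging probabilities is a common way to combine heads and is an assumption here, as are the function names and the toy logits; the text above does not specify how the heads are combined.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class logits."""
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(head_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across heads (assumed combination rule)."""
    probs = [softmax(logits) for logits in head_logits]
    n_heads = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_heads for c in range(n_classes)]


if __name__ == "__main__":
    # Toy logits from two hypothetical heads, e.g. one per cross-validation fold.
    fold_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
    averaged = ensemble_predict(fold_logits)
    predicted_class = max(range(len(averaged)), key=averaged.__getitem__)
    print(averaged, predicted_class)
```

Because each head only consumes a fixed-size embedding, the heads stay cheap to train and evaluate, and the vision backbone is run once per image.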

Paper Structure (18 sections, 2 equations, 4 figures, 4 tables)


Figures (4)

  • Figure 1: Sample images from the dataset. The images of fungi vary widely in lighting, substrate, focus, subject, and other image features. This poses an additional challenge for training an effective classifier.
  • Figure 2: Distribution of classes in DF20 (training set) and DF21 (validation set). The two datasets have different class distributions. Both show significant class imbalance, visible as a long tail of classes with low counts.
  • Figure 3: Dataset preparation pipeline. To include the unknown classes of fungi from the validation dataset in model training, we merged the training and validation datasets into a single full dataset before splitting it further.
  • Figure 4: Clustering of the top 5 fungi species on ResNet and DINOv2 embeddings with UMAP. The ResNet embeddings do not separate well, whereas the DINOv2 embeddings form clearly separated clusters.