Foundation Model or Finetune? Evaluation of few-shot semantic segmentation for river pollution

Marga Don, Stijn Pinson, Blanca Guillen Cebrian, Yuki M. Asano

TL;DR

Finetuned models consistently outperform the FMs tested, even in cases where data is scarce.

Abstract

Foundation models (FMs) are a popular topic of research in AI. Their ability to generalize to new tasks and datasets without retraining or needing an abundance of data makes them an appealing candidate for applications on specialist datasets. In this work, we compare the performance of FMs to finetuned pre-trained supervised models on the task of semantic segmentation on an entirely new dataset. We see that finetuned models consistently outperform the FMs tested, even in cases where data is scarce. We release the code and dataset for this work on GitHub.

Paper Structure

This paper contains 33 sections, 2 equations, 13 figures, 8 tables.

Figures (13)

  • Figure 1: Example images from the 6 locations in the dataset (upper) with ground truth annotations (lower). Yellow denotes in-system trash, pink denotes out-system trash, light blue denotes water and dark blue denotes the barrier.
  • Figure 2: Example performance of all 4 models using an image from Location 2.
  • Figure 3: BMC vs mIoU-In% for SegGPT on all locations, with constant prompt image and mask.
  • Figure 4: MFS vs mIoU-In% for PerSAM-F on all locations, with constant prompt image and mask.
  • Figure 5: Performance of YOLOv8 model finetuned on different sizes of training data. Models were all evaluated on the same 20% test split.
  • ...and 8 more figures