DAPlankton: Benchmark Dataset for Multi-instrument Plankton Recognition via Fine-grained Domain Adaptation
Daniel Batrakhanov, Tuomas Eerola, Kaisa Kraft, Lumi Haraguchi, Lasse Lensu, Sanna Suikkanen, María Teresa Camarena-Gómez, Jukka Seppälä, Heikki Kälviäinen
TL;DR
DAPlankton addresses the domain shift that different imaging instruments cause in plankton recognition. It introduces a benchmark with two subsets, DAPlankton_LAB and DAPlankton_SEA, which contain cultured phytoplankton and natural Baltic Sea data, respectively, imaged with multiple instruments. The authors provide an evaluation protocol for unsupervised closed-set domain adaptation and report a preliminary benchmark of three baseline methods (Deep CORAL, CDAN, Deep MEDA) using AlexNet and ResNet-18. The results show that existing DA methods improve over non-adaptive baselines but struggle with fine-grained, imbalanced, multi-instrument plankton data, highlighting the need for novel approaches. The publicly released dataset enables reproducible benchmarking and motivates the development of methods capable of robust cross-instrument plankton recognition.
Abstract
Plankton recognition provides novel possibilities to study various environmental aspects and an interesting real-world context in which to develop domain adaptation (DA) methods. Different imaging instruments cause domain shift between datasets, which hampers the development of general plankton recognition methods. A promising remedy is DA, which allows a model trained on data from one instrument to be adapted to other instruments. In this paper, we present a new DA dataset called DAPlankton, which consists of phytoplankton images obtained with different instruments. Phytoplankton recognition is a challenging DA problem due to the fine-grained nature of the task and the high class imbalance in real-world datasets. DAPlankton consists of two subsets. DAPlankton_LAB contains images of cultured phytoplankton, providing a balanced dataset with minimal label uncertainty. DAPlankton_SEA consists of images collected from the Baltic Sea, providing challenging real-world data with large intra-class variance and class imbalance. We further present a benchmark comparison of three widely used DA methods.
