ZooplanktonBench: A Geo-Aware Zooplankton Recognition and Classification Dataset from Marine Observations
Fukun Liu, Adam T. Greer, Gengchen Mai, Jin Sun
TL;DR
ZooplanktonBench introduces a geo-aware dataset of zooplankton imagery and video captured in the northern Gulf of Mexico to support detection, fine-grained classification, and discrimination of living organisms from marine snow under varied depths and backgrounds. The benchmark defines two core tasks and two tracks (diverse-depth and image+video), evaluated with YOLOv8, open-set detectors such as Grounding DINO, and GPT-4V to explore the capabilities and limits of current methods in marine environments. Results show that depth and background clutter significantly affect performance: the best single-depth performance occurs around 25 m, and incorporating unlabeled video data yields substantial gains, while open-set detectors and large language models struggle with fine-grained, geo-aware zooplankton understanding. The work provides a resource for advancing geo-aware computer vision in marine ecology and highlights directions for future research, including annotated video datasets and knowledge-grounded approaches.
Abstract
Plankton are small drifting organisms found throughout the world's oceans and can be indicators of ocean health. One component of this plankton community is the zooplankton, which includes gelatinous animals and crustaceans (e.g., shrimp), as well as the early life stages (i.e., eggs and larvae) of many commercially important fishes. Being able to monitor zooplankton abundances accurately and understand how populations change in relation to ocean conditions is invaluable to marine science research, with important implications for future marine seafood productivity. While new imaging technologies generate massive amounts of zooplankton video data, analyzing this data with general-purpose computer vision tools turns out to be highly challenging because zooplankton closely resemble their background (e.g., marine snow). In this work, we present ZooplanktonBench, a benchmark dataset containing images and videos of zooplankton associated with rich geospatial metadata (e.g., geographic coordinates and depth) in various water ecosystems. ZooplanktonBench defines a collection of tasks to detect, classify, and track zooplankton in challenging settings, including highly cluttered environments, living vs. non-living classification, objects with similar shapes, and relatively small objects. Our dataset presents unique challenges and opportunities for state-of-the-art computer vision systems to evolve and improve visual understanding in dynamic environments characterized by significant variation and the need for geo-awareness. The code and settings described in this paper can be found on our website: https://lfk118.github.io/ZooplanktonBench_Webpage.
