ZooplanktonBench: A Geo-Aware Zooplankton Recognition and Classification Dataset from Marine Observations

Fukun Liu, Adam T. Greer, Gengchen Mai, Jin Sun

TL;DR

ZooplanktonBench introduces a geo-aware dataset of zooplankton imagery and video captured in the northern Gulf of Mexico to support detection, fine-grained classification, and living-vs-marine-snow discrimination under varied depths and backgrounds. The benchmark defines two core tasks and two tracks (diverse-depth and image+video), evaluated with YOLOv8, open-set detectors such as Grounding DINO, and GPT-4V to explore the capabilities and limits of current methods in marine environments. Results show that depth and background clutter significantly affect performance, with the best single-depth performance at around 25 m and substantial gains from incorporating unlabeled video data, while open-set detectors and large language models face notable challenges in fine-grained, geo-aware zooplankton understanding. The work provides a resource for advancing geo-aware computer vision in marine ecology and highlights directions for future research, including annotated video datasets and knowledge-grounded approaches.

Abstract

Plankton are small drifting organisms found throughout the world's oceans and can be indicators of ocean health. One component of this plankton community is the zooplankton, which includes gelatinous animals and crustaceans (e.g., shrimp), as well as the early life stages (i.e., eggs and larvae) of many commercially important fishes. Being able to monitor zooplankton abundances accurately and understand how populations change in relation to ocean conditions is invaluable to marine science research, with important implications for future marine seafood productivity. While new imaging technologies generate massive amounts of video data of zooplankton, analyzing these data with general-purpose computer vision tools is highly challenging because zooplankton closely resemble their background (e.g., marine snow). In this work, we present ZooplanktonBench, a benchmark dataset containing images and videos of zooplankton associated with rich geospatial metadata (e.g., geographic coordinates and depth) in various water ecosystems. ZooplanktonBench defines a collection of tasks to detect, classify, and track zooplankton in challenging settings, including highly cluttered environments, living vs. non-living classification, objects with similar shapes, and relatively small objects. Our dataset presents unique challenges and opportunities for state-of-the-art computer vision systems to evolve and improve visual understanding in dynamic environments characterized by significant variation and the need for geo-awareness. The code and settings described in this paper can be found on our website: https://lfk118.github.io/ZooplanktonBench_Webpage.

Paper Structure

This paper contains 15 sections, 6 figures, and 6 tables.

Figures (6)

  • Figure 1: (a) The In Situ Ichthyoplankton Imaging System (ISIIS, left) on the deck of a research vessel. Plankton or particles that pass in between the bottom two pods are captured in the images as "shadows." (b) An example of an eel larva (Leptocephalus larva) captured in a container. (c) The same eel larva as viewed in a shadowgraph imaging system.
  • Figure 2: The sampling region in the northern Gulf of Mexico (nGOM). Bathymetry contours correspond to 10-m intervals (the 20, 30, 40, and 50 m contours are labeled). The red line corresponds to the star-shaped sampling pattern between the 40- and 50-m isobaths, including crossings at three different depths (10, 25, and 35 m). The bottom right inset provides a 3D view of the sampling region.
  • Figure 3: Example photos of zooplankton in our ZooplanktonBench dataset.
  • Figure 4: Examples of zooplankton and marine snow. Marine snow is abundant and can take many different shapes. By chance, it can resemble living zooplankton in the images.
  • Figure 5: Image examples from 10 meters, 25 meters, and 35 meters. Each image is 13 cm by 13 cm and looks through 40 cm of ocean water (i.e., image depth of field).
  • ...and 1 more figure
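
The Figure 5 caption gives a frame size of 13 cm by 13 cm and a depth of field of 40 cm. As a rough illustration of what that implies for abundance estimates, the sketch below computes the water volume seen in one frame, assuming the imaged region can be treated as a simple rectangular box. The function names and the `concentration_per_liter` helper are hypothetical and are not taken from the paper or its code; the helper further assumes that frames do not overlap.

```python
# Frame dimensions and depth of field as stated in the Figure 5 caption.
FRAME_WIDTH_CM = 13.0
FRAME_HEIGHT_CM = 13.0
DEPTH_OF_FIELD_CM = 40.0

CM3_PER_LITER = 1000.0


def imaged_volume_liters(width_cm=FRAME_WIDTH_CM,
                         height_cm=FRAME_HEIGHT_CM,
                         depth_cm=DEPTH_OF_FIELD_CM):
    """Volume of water in one frame, assuming a rectangular imaged region."""
    return width_cm * height_cm * depth_cm / CM3_PER_LITER


def concentration_per_liter(count, n_frames, volume_per_frame_l):
    """Hypothetical helper: organisms per liter over non-overlapping frames."""
    if n_frames <= 0 or volume_per_frame_l <= 0:
        raise ValueError("n_frames and volume_per_frame_l must be positive")
    return count / (n_frames * volume_per_frame_l)


volume = imaged_volume_liters()
print(f"{volume:.2f} L per frame")  # 13 * 13 * 40 cm^3 = 6760 cm^3 = 6.76 L
```

Under these assumptions each frame covers about 6.76 L of water, so a detection count can be converted to an approximate concentration by dividing by the number of frames times that volume.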