MPT: A Large-scale Multi-Phytoplankton Tracking Benchmark
Yang Yu, Yuezun Li, Xin Sun, Junyu Dong
TL;DR
This work addresses the challenge of real-time plankton monitoring by introducing MPT, a large-scale synthetic video benchmark with 140 4K sequences across 27 species and 14 backgrounds, enabling robust evaluation of multi-object tracking in underwater environments. It also presents DSFT, a Deviation-Corrected Multi-Scale Feature Fusion Tracker that combines a residual-predicting auxiliary extractor (DCM) with multi-scale feature fusion (MFSF) to mitigate focus shifts and the loss of small-object information during tracking. The authors validate MPT and DSFT through extensive experiments and ablations, reporting substantial improvements over baselines and establishing a practical framework for real-time phytoplankton observation and monitoring. MPT thus provides a versatile resource bridging detection and tracking in marine contexts, while DSFT offers a specialized online MOT solution tailored to the challenges of plankton data and underwater backgrounds.
Abstract
Phytoplankton are a crucial component of aquatic ecosystems, and monitoring them effectively can provide valuable insights into ocean environments and ecosystem changes. Traditional phytoplankton monitoring methods are often complex and do not support timely analysis, which makes deep learning a promising approach for automated monitoring. However, the lack of large-scale, high-quality training samples has become a major bottleneck in advancing phytoplankton tracking. In this paper, we propose a challenging benchmark dataset, Multiple Phytoplankton Tracking (MPT), which covers diverse backgrounds and variations in motion during observation. The dataset comprises 140 videos covering 27 species of phytoplankton and zooplankton and 14 backgrounds that simulate diverse and complex underwater environments. To enable accurate real-time observation of phytoplankton, we introduce a multi-object tracking method, the Deviation-Corrected Multi-Scale Feature Fusion Tracker (DSFT), which addresses focus shifts during tracking and the loss of small-target information when computing frame-to-frame similarity. Specifically, we introduce an additional feature extractor that predicts the residuals of the standard feature extractor's output, and we compute multi-scale frame-to-frame similarity from features at different layers of the extractor. Extensive experiments on MPT demonstrate the validity of the dataset and the superiority of DSFT in tracking phytoplankton, providing an effective solution for phytoplankton monitoring.
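
To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that residual correction amounts to adding a predicted residual to the base features, and that multi-scale similarity can be approximated by a weighted average of per-scale cosine similarity matrices. The function names, the equal default weights, and the choice of cosine similarity are all illustrative assumptions; in DSFT the residual would come from a learned auxiliary extractor, whereas here it is simply passed in as an array.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def correct_features(base, residual):
    """Hypothetical correction step: add a predicted residual to the base features."""
    return base + residual


def multi_scale_similarity(prev_feats, curr_feats, weights=None):
    """Weighted average of per-scale cosine similarity matrices (an assumption).

    prev_feats, curr_feats: lists with one (num_objects, dim) array per scale.
    Returns a (num_prev, num_curr) similarity matrix.
    """
    if weights is None:
        # Assumed default: every scale contributes equally.
        weights = [1.0 / len(prev_feats)] * len(prev_feats)
    sim = 0.0
    for w, p, c in zip(weights, prev_feats, curr_feats):
        sim = sim + w * (l2_normalize(p) @ l2_normalize(c).T)
    return sim


# Toy usage: three objects described at two feature scales; the current frame
# holds the same objects in a shuffled order.
rng = np.random.default_rng(0)
prev = [rng.normal(size=(3, 8)), rng.normal(size=(3, 16))]
order = [2, 0, 1]
curr = [p[order] for p in prev]

sim = multi_scale_similarity(prev, curr)
matches = sim.argmax(axis=1)  # best current-frame candidate for each previous object
```

In a full tracker, a similarity matrix like `sim` would typically feed an assignment step (for example, Hungarian matching) rather than a row-wise `argmax`; the latter is used here only to keep the sketch short.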
