SparseUWSeg: Active Sparse Point-Label Augmentation for Underwater Semantic Segmentation
César Borja, Carlos Plou, Rubén Martinez-Cantín, Ana C. Murillo
TL;DR
SparseUWSeg tackles the challenge of generating dense underwater semantic segmentations from scarce expert annotations by coupling an active point-selection strategy with a hybrid label augmentation pipeline that merges SAM2-based masks and PLAS superpixel propagation. The method defines an acquisition function that balances proximity to object centroids and coverage, and propagates seeds through a two-stage augmentation to yield dense masks with full-image coverage. Across UCSD Mosaics and SUIM, SparseUWSeg outperforms state-of-the-art sparse-label augmentation baselines, achieving consistent gains in masked and unmasked metrics, particularly at smaller budgets and with active sampling. The work also releases an interactive annotation tool to help ecology researchers efficiently generate high-quality segmentation masks, bridging foundation-model capabilities with domain-specific marine imagery analysis.
Abstract
Semantic segmentation is essential for automating the analysis of underwater imagery for ecological monitoring. Unfortunately, fine-grained underwater scene analysis is still an open problem, even for top-performing segmentation models. The high cost of obtaining dense, expert-annotated segmentation labels hinders the supervision of models in this domain. While sparse point-labels are easier to obtain, they introduce challenges regarding which points to annotate and how to propagate the sparse information. We present SparseUWSeg, a novel framework that addresses both issues. SparseUWSeg employs an active sampling strategy to guide annotators, maximizing the value of their point labels. Then, it propagates these sparse labels with a hybrid approach that leverages the best of both SAM2 and superpixel-based methods. Experiments on two diverse underwater datasets demonstrate the benefits of SparseUWSeg over state-of-the-art approaches, achieving up to +5% mIoU over D+NN. Our main contribution is the design and release of a simple but effective interactive annotation tool that integrates our algorithms. It enables ecology researchers to leverage foundation models and computer vision to efficiently generate high-quality segmentation masks to process their data.
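As a rough illustration of the hybrid propagation idea, the sketch below merges two label maps: one from a mask-based stage that may leave pixels unlabeled, and one from a superpixel-based stage that covers the whole image. It is a minimal sketch under stated assumptions, not the SparseUWSeg implementation: the function name `merge_label_maps`, the `UNLABELED` sentinel, and the rule that mask labels take priority over superpixel labels are all hypothetical choices made for this example.

```python
import numpy as np

UNLABELED = -1  # assumed sentinel for pixels the mask stage did not label


def merge_label_maps(mask_labels, superpixel_labels):
    """Keep mask-stage labels where present; fill the rest from the superpixel stage.

    Both inputs are integer class maps of identical shape (H, W).
    The priority order is an assumption made for this illustration.
    """
    mask_labels = np.asarray(mask_labels)
    superpixel_labels = np.asarray(superpixel_labels)
    if mask_labels.shape != superpixel_labels.shape:
        raise ValueError("label maps must have the same shape")
    return np.where(mask_labels != UNLABELED, mask_labels, superpixel_labels)


# Toy 2x3 example: the mask stage leaves two pixels unlabeled.
mask_stage = np.array([[1, 1, UNLABELED],
                       [UNLABELED, 2, 2]])
superpixel_stage = np.array([[1, 0, 0],
                             [3, 3, 2]])
dense = merge_label_maps(mask_stage, superpixel_stage)
```

In this toy case every pixel of `dense` ends up with a class label, which mirrors the full-image coverage that the hybrid augmentation aims for.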
