USIS-PGM: Photometric Gaussian Mixtures for Underwater Salient Instance Segmentation

Lin Hong, Xiangtong Yao, Mürüvvet Bozkurt, Xin Wang, Fumin Zhang

Abstract

Underwater salient instance segmentation (USIS) is crucial for marine robotic systems, as it combines underwater salient object detection with instance-level mask prediction for visual scene understanding. Compared with its terrestrial counterpart, USIS is more challenging because of underwater image degradation. To address this issue, this paper proposes USIS-PGM, a single-stage framework for USIS. Specifically, the encoder enhances boundary cues through a frequency-aware module and performs content-adaptive feature reweighting via a dynamic weighting module. The decoder incorporates a Transformer-based instance activation module to better distinguish salient instances. In addition, USIS-PGM supervises intermediate decoder features with multi-scale Gaussian heatmaps generated from ground-truth masks through a Photometric Gaussian Mixture (PGM), thereby improving salient instance localization and producing more structurally coherent mask predictions. Experiments on the USIS16K dataset against 12 benchmark methods, together with a 3D reconstruction application, demonstrate the superiority and practical applicability of the proposed USIS-PGM model.

Paper Structure (14 sections, 3 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Overall framework of the proposed USIS-PGM model, consisting of an encoder, a decoder, and the PGM-based multi-scale supervision strategy. The encoder integrates FAN for boundary-aware representation and the dynamic weighting module (DWM) for content-adaptive feature reweighting, while the decoder employs a Transformer-based instance activation module (IAM) for accurate salient instance localization and mask prediction.
  • Figure 2: Visualization of Gaussian heatmaps generated from a ground-truth mask using $G(I;\lambda)$ with $\lambda \in \{1,5,10,20\}$.
  • Figure 3: Qualitative comparison of the proposed USIS-PGM model with 12 benchmark methods on the USIS16K dataset.
  • Figure 4: Precision-recall (PR) curves of the proposed USIS-PGM model and 12 benchmark methods on four representative object categories, i.e., (a) Diver, (b) Swimmer, (c) Plastic cup, and (d) Sea chest grating.
  • Figure 5: 3D reconstruction results of underwater objects, including (a) sacrificial anodes and (b) sea snails, obtained using USIS-PGM and COLMAP (Schönberger and Frahm, 2016).
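
The heatmaps in Figure 2 can be illustrated with a minimal sketch. The definition of $G(I;\lambda)$ is not reproduced in this section, so the code below assumes the usual photometric Gaussian mixture form, in which every pixel contributes an isotropic Gaussian of extent $\lambda$ weighted by its intensity; for a binary mask this reduces to a sum of Gaussians centred on the foreground pixels. The function names, the kernel truncation at $3\lambda$, and the rescaling to $[0, 1]$ are assumptions made for illustration, not the authors' implementation. Plain Python lists are used so the sketch has no dependencies.

```python
import math


def gaussian_kernel_1d(lam, truncate=3.0):
    """Unnormalized 1D Gaussian exp(-d^2 / (2 lam^2)), cut off at truncate * lam."""
    radius = max(1, int(math.ceil(truncate * lam)))
    return [math.exp(-(d * d) / (2.0 * lam * lam))
            for d in range(-radius, radius + 1)]


def pgm_heatmap(mask, lam):
    """Sum of isotropic Gaussians centred on each foreground pixel of `mask`.

    `mask` is a list of rows with values in {0, 1}. The result is rescaled
    to [0, 1]; this normalization is an assumption of the sketch.
    """
    h, w = len(mask), len(mask[0])
    k = gaussian_kernel_1d(lam)
    r = len(k) // 2

    # The 2D Gaussian is separable: accumulate along rows, then along columns.
    rows = [[sum(k[d + r] * mask[y][x + d]
                 for d in range(-r, r + 1) if 0 <= x + d < w)
             for x in range(w)] for y in range(h)]
    heat = [[sum(k[d + r] * rows[y + d][x]
                 for d in range(-r, r + 1) if 0 <= y + d < h)
             for x in range(w)] for y in range(h)]

    peak = max(max(row) for row in heat)
    if peak == 0:
        return heat  # empty mask: nothing to normalize
    return [[v / peak for v in row] for row in heat]


def multi_scale_heatmaps(mask, lambdas=(1, 5, 10, 20)):
    """One heatmap per extent parameter, keyed by lambda."""
    return {lam: pgm_heatmap(mask, lam) for lam in lambdas}


# Toy 32x32 mask containing a single 8x8 square instance.
mask = [[0] * 32 for _ in range(32)]
for y in range(12, 20):
    for x in range(12, 20):
        mask[y][x] = 1

heatmaps = multi_scale_heatmaps(mask)
```

With a small $\lambda$ the heatmap stays close to the mask outline, whereas a large $\lambda$ spreads the response far beyond the object, which is the coarse-to-fine behaviour that multi-scale supervision relies on.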