Unsupervised Region-Growing Network for Object Segmentation in Atmospheric Turbulence

Dehao Qin, Ripon Saha, Suren Jayasuriya, Jinwei Ye, Nianyi Li

TL;DR

The paper tackles moving object segmentation in long-range videos degraded by atmospheric turbulence using an unsupervised approach. It combines three components: a geometry-based motion disentanglement step that applies the Sampson distance under epipolar constraints, a detect-then-grow region-growing scheme that produces per-object masks, and a Refine-Net trained with bidirectional spatio-temporal losses to enforce temporal and spatial consistency. The authors introduce the DOST dataset, a real long-range turbulent video collection with ground-truth masks, and demonstrate that the proposed method outperforms prior unsupervised methods across varying turbulence strengths while remaining robust to camera shake. Limitations include a reported throughput of only about 0.95 FPS and difficulty with overlapping objects. Future work aims to speed up inference and to incorporate additional cues such as appearance or saliency, potentially aided by foundation models.

Abstract

Moving object segmentation in the presence of atmospheric turbulence is highly challenging due to turbulence-induced irregular and time-varying distortions. In this paper, we present an unsupervised approach for segmenting moving objects in videos degraded by atmospheric turbulence. Our key idea is a detect-then-grow scheme: we first identify a small set of moving object pixels with high confidence, then gradually grow a foreground mask from those seeds to segment all moving objects. This method leverages rigid geometric consistency among video frames to disentangle different types of motion, and then uses the Sampson distance to initialize the seed pixels. After growing per-frame foreground masks, we use a spatial grouping loss and a temporal consistency loss to refine the masks and ensure their spatio-temporal consistency. Our method is unsupervised and does not require training on labeled data. For validation, we collect and release the first real-captured long-range turbulent video dataset with ground-truth masks for moving objects. Results show that our method achieves good accuracy in segmenting moving objects and is robust for long-range videos with various turbulence strengths.
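
To make the role of the Sampson distance more concrete, here is a minimal NumPy sketch of the standard textbook definition (the first-order approximation of the geometric error of a correspondence with respect to a fundamental matrix). It is not the authors' implementation: the function name `sampson_distance`, the array layout, and the toy fundamental matrix are assumptions made for illustration, and the summary above does not say how the paper estimates the fundamental matrix or thresholds the result.

```python
import numpy as np

def sampson_distance(F, pts1, pts2):
    """Sampson distance of correspondences pts1 <-> pts2 under fundamental matrix F.

    F    : (3, 3) fundamental matrix (assumed already estimated elsewhere).
    pts1 : (N, 2) pixel coordinates in the first frame.
    pts2 : (N, 2) matching pixel coordinates in the second frame.
    Returns an (N,) array; values near zero mean the match is consistent
    with the rigid epipolar geometry encoded by F.
    """
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])          # homogeneous points, shape (N, 3)
    x2 = np.hstack([pts2, ones])

    Fx1 = x1 @ F.T                        # row i is F @ x1_i
    Ftx2 = x2 @ F                         # row i is F^T @ x2_i

    numerator = np.sum(x2 * Fx1, axis=1) ** 2          # (x2^T F x1)^2
    denominator = (Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2
                   + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2)
    return numerator / np.maximum(denominator, 1e-12)  # guard against 0/0

# Toy example (hypothetical): F for a purely horizontal camera translation,
# for which matching points must stay on the same image row.
F_toy = np.array([[0.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0]])
pts_a = np.array([[10.0, 20.0], [10.0, 20.0]])
pts_b = np.array([[15.0, 20.0], [15.0, 23.0]])  # second match drifts by 3 rows
distances = sampson_distance(F_toy, pts_a, pts_b)
```

In this toy setup the first correspondence satisfies the epipolar constraint and the second does not, so only the second receives a non-zero distance. Pixels with large values are natural candidates for independent object motion, which is one plausible way such a score could be used to pick seeds.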

Paper Structure (12 sections, 9 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Our method robustly segments moving objects under various turbulence strengths, while state-of-the-art methods may fail under strong turbulence (2nd video).
  • Figure 2: Overall pipeline of our unsupervised motion segmentation method. We first generate motion feature maps by applying a geometry-based consistency check on optical flows. We then adopt a region-growing scheme to generate coarse segmentation masks. Finally, we refine the masks using cross-entropy-based consistency losses to enforce their spatio-temporal consistency.
  • Figure 3: Pipeline of epipolar geometry-based motion disentanglement. Since the raw optical flows are degraded by turbulence, we apply a geometry-based consistency check to generate motion feature maps that only preserve object motion.
  • Figure 4: Step-by-step intermediate results for motion feature map estimation.
  • Figure 5: Pipeline of region growing-based segmentation. We select seeds on the motion feature map using a sliding window. We then grow the seeds to full segmentation masks.
  • ...and 3 more figures
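
The detect-then-grow idea described in the Figure 5 caption (select seeds on the motion feature map with a sliding window, then grow them into full masks) can be sketched as follows. This is a generic, classical region-growing illustration, not the paper's network: the function names `select_seeds` and `grow_regions`, the non-overlapping window stride, the 4-connectivity, and both thresholds are assumptions chosen for the example.

```python
from collections import deque
import numpy as np

def select_seeds(motion_map, window=8, seed_thresh=0.8):
    """Pick the strongest pixel of each window as a seed if it is confident enough.

    Assumption: windows are non-overlapping (stride == window size).
    """
    h, w = motion_map.shape
    seeds = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            patch = motion_map[r0:r0 + window, c0:c0 + window]
            r, c = np.unravel_index(np.argmax(patch), patch.shape)
            if patch[r, c] >= seed_thresh:
                seeds.append((r0 + int(r), c0 + int(c)))
    return seeds

def grow_regions(motion_map, seeds, grow_thresh=0.4):
    """Breadth-first growth from seeds over 4-connected neighbours.

    A neighbour joins the mask when its motion score is at least grow_thresh.
    """
    h, w = motion_map.shape
    mask = np.zeros((h, w), dtype=bool)
    queue = deque()
    for r, c in seeds:
        if not mask[r, c]:
            mask[r, c] = True
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < h and 0 <= nc < w
            if inside and not mask[nr, nc] and motion_map[nr, nc] >= grow_thresh:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask

# Toy motion feature map (hypothetical values): one 4x4 moving blob with a
# confident peak, plus an isolated weak response that should be ignored.
motion_map = np.zeros((16, 16))
motion_map[4:8, 4:8] = 0.5
motion_map[5, 5] = 0.9
motion_map[12, 12] = 0.5

seeds = select_seeds(motion_map)
mask = grow_regions(motion_map, seeds)
```

Using a strict threshold for seeds and a looser one for growth means that weak, isolated responses (such as residual turbulence motion) never start a region, while moderately confident pixels are still recovered when they connect to a confident seed.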