Unsupervised Region-Growing Network for Object Segmentation in Atmospheric Turbulence
Dehao Qin, Ripon Saha, Suren Jayasuriya, Jinwei Ye, Nianyi Li
TL;DR
The paper tackles moving object segmentation in long-range videos degraded by atmospheric turbulence using an unsupervised approach. It combines three components: a geometry-based motion disentanglement step that uses the Sampson distance under epipolar constraints, a detect-then-grow region-growing scheme that produces per-object masks, and a Refine-Net trained with bidirectional spatio-temporal losses to enforce temporal and spatial consistency. The authors introduce the DOST dataset, a collection of real long-range turbulent videos with ground-truth masks, and show that the proposed method outperforms prior unsupervised methods across varying turbulence strengths while remaining robust to camera shake. Limitations include a reported processing speed of about 0.95 FPS and difficulty with overlapping objects. Future work aims to speed up inference and to incorporate additional cues such as appearance or saliency, potentially aided by foundation models.
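To make the geometric cue concrete, the following is a minimal sketch of the standard Sampson distance between point correspondences and a fundamental matrix. It is not the authors' implementation: the function name `sampson_distance`, the NumPy array layout, and the small denominator guard are assumptions made for illustration. The underlying idea is that correspondences consistent with the rigid (camera-induced) geometry yield a small value, while pixels on independently moving objects tend to yield a larger one.

```python
import numpy as np

def sampson_distance(F, pts1, pts2):
    """Sampson (first-order geometric) error of correspondences pts1 <-> pts2
    with respect to a 3x3 fundamental matrix F.

    pts1, pts2: arrays of shape (N, 2) holding pixel coordinates (x, y).
    Returns an array of shape (N,).
    """
    F = np.asarray(F, dtype=float)
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])   # homogeneous points in frame 1, shape (N, 3)
    x2 = np.hstack([pts2, ones])   # homogeneous points in frame 2, shape (N, 3)

    Fx1 = x1 @ F.T                 # row i is F @ x1_i (epipolar line in frame 2)
    Ftx2 = x2 @ F                  # row i is F.T @ x2_i (epipolar line in frame 1)

    num = np.sum(x2 * Fx1, axis=1) ** 2   # (x2^T F x1)^2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / np.maximum(den, 1e-12)   # assumed guard against division by zero

# Hypothetical example: F for a pure horizontal camera translation,
# for which the epipolar constraint reduces to y1 == y2.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts1 = np.array([[10.0, 20.0], [30.0, 40.0]])
pts2 = np.array([[15.0, 20.0], [35.0, 44.0]])
d = sampson_distance(F, pts1, pts2)
```

In this toy case the first correspondence stays on its epipolar line and gets a distance of 0, while the second is displaced vertically by 4 pixels and gets (4^2)/2 = 8. How the paper thresholds or aggregates such values is not specified in this summary.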
Abstract
Moving object segmentation in the presence of atmospheric turbulence is highly challenging due to turbulence-induced irregular and time-varying distortions. In this paper, we present an unsupervised approach for segmenting moving objects in videos degraded by atmospheric turbulence. Our key idea is a detect-then-grow scheme: we first identify a small set of moving object pixels with high confidence, then gradually grow a foreground mask from those seeds to segment all moving objects. This method leverages rigid geometric consistency among video frames to disentangle different types of motion, and then uses the Sampson distance to initialize the seed pixels. After growing per-frame foreground masks, we use a spatial grouping loss and a temporal consistency loss to further refine the masks and ensure their spatio-temporal consistency. Our method is unsupervised and does not require training on labeled data. For validation, we collect and release the first real-captured long-range turbulent video dataset with ground-truth masks for moving objects. Results show that our method achieves good accuracy in segmenting moving objects and is robust for long-range videos with various turbulence strengths.
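The detect-then-grow idea can be illustrated with a generic seeded region-growing routine. This is only a sketch under stated assumptions, not the paper's procedure: the function name `grow_region`, the single-channel `feature` map, the 4-connected neighbourhood, and the fixed tolerance `tol` are all hypothetical choices, and the abstract does not state the growing criterion the authors use.

```python
from collections import deque
import numpy as np

def grow_region(feature, seeds, tol):
    """Grow a foreground mask from high-confidence seed pixels.

    feature: (H, W) array of a per-pixel cue (assumed here to be a motion score).
    seeds:   (H, W) boolean array marking the initial seed pixels.
    tol:     a neighbour joins the region when its feature value differs by at
             most `tol` from the already-accepted pixel that reached it.
    Returns a boolean (H, W) mask.
    """
    feature = np.asarray(feature, dtype=float)
    mask = np.asarray(seeds, dtype=bool).copy()
    h, w = feature.shape
    queue = deque(zip(*np.nonzero(mask)))      # breadth-first frontier of accepted pixels
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-connected neighbours
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(feature[nr, nc] - feature[r, c]) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask

# Hypothetical example: a 2x2 "moving" block on a static background,
# with a single seed placed inside the block.
feature = np.zeros((4, 4))
feature[1:3, 1:3] = 5.0
seeds = np.zeros((4, 4), dtype=bool)
seeds[1, 1] = True
mask = grow_region(feature, seeds, tol=0.5)
```

Here the mask expands from the single seed to cover the whole 2x2 block and stops at its boundary, where the feature value jumps. A per-frame mask obtained this way would still need the spatio-temporal refinement described in the abstract.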
