STSeg-Complex Video Object Segmentation: The 1st Solution for 4th PVUW MOSE Challenge

Kehuan Song; Xinglin Xie; Kexin Zhang; Licheng Jiao; Lingling Li; Shuyuan Yang

TL;DR

The STSeg solution achieved a J&F score of 87.26% on the test set of the 2025 4th PVUW Challenge MOSE Track, securing 1st place and advancing video object segmentation in complex scenarios.

Abstract

Segmenting video objects in complex scenarios is highly challenging, and the MOSE dataset has contributed significantly to progress in this field. This technical report details the STSeg solution proposed by the "imaplus" team. STSeg finetunes SAM2 and the unsupervised model TMO on the MOSE dataset, which gives it clear advantages in handling complex object motion and long video sequences. In the inference phase, an Adaptive Pseudo-labels Guided Model Refinement Pipeline selects an appropriate model for processing each video. With this combination of finetuning and adaptive inference, STSeg achieved a J&F score of 87.26% on the test set of the 2025 4th PVUW Challenge MOSE Track, securing 1st place and advancing video object segmentation in complex scenarios.
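
For readers unfamiliar with the reported metric: J&F is conventionally the arithmetic mean of region similarity J (the intersection over union of the predicted and ground-truth masks) and contour accuracy F (a boundary F-measure). The sketch below is a minimal illustration of that definition in plain Python, not the challenge's evaluation code. The function names, the toy masks, and the F value of 0.75 are all hypothetical, and treating two empty masks as a perfect match is an assumption.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks.

    Masks are given as equally sized lists of rows of 0/1 values.
    """
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_score, f_score):
    """J&F: arithmetic mean of region similarity J and contour accuracy F."""
    return (j_score + f_score) / 2.0


# Toy 2x3 masks (hypothetical data, for illustration only).
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]

j = jaccard(pred, gt)       # 2 overlapping pixels out of 4 in the union
score = j_and_f(j, 0.75)    # 0.75 is a made-up F value, not a measured one
```

In practice J and F are computed per object and per frame and then averaged over the dataset; the boundary-based F computation is omitted here for brevity.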
