LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation
Henghui Ding, Lingyi Hong, Chang Liu, Ning Xu, Linjie Yang, Yuchen Fan, Deshui Miao, Yameng Gu, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Jinming Chai, Qin Ma, Junpei Zhang, Licheng Jiao, Fang Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Xu Liu, LingLing Li, Hao Fang, Feiyu Pan, Xiankai Lu, Wei Zhang, Runmin Cong, Tuyen Tran, Bin Cao, Yisi Zhang, Hanyi Wang, Xingjian He, Jing Liu
TL;DR
The paper presents the 6th LSVOS Challenge, which addresses the gap between benchmark performance and real-world video complexity through VOS and RVOS tasks evaluated on the MOSE, LVOS, and MeViS datasets. It highlights diverse memory-augmented and promptable segmentation approaches, including UNINEXT-, MUTR-, Grounding DINO-, and HQ-SAM–based pipelines, as well as SAM2-inspired memory mechanisms. The report analyzes top-performing methods across tracks, notes substantial participation (129 registered teams), and discusses progress and remaining challenges in long-term temporal coherence, multi-object scenarios, and motion-rich references. Overall, the challenge emphasizes memory, cross-modal cues, and promptable segmentation as key drivers toward scalable, real-world video object segmentation.
Abstract
Despite the promising performance of current video segmentation models on existing benchmarks, these models still struggle with complex scenes. In this paper, we introduce the 6th Large-scale Video Object Segmentation (LSVOS) challenge, held in conjunction with the ECCV 2024 workshop. This year's challenge includes two tasks: Video Object Segmentation (VOS) and Referring Video Object Segmentation (RVOS). This year, we replace the classic YouTube-VOS and YouTube-RVOS benchmarks with the latest datasets, MOSE, LVOS, and MeViS, to assess VOS in more challenging, complex environments. The challenge attracted 129 registered teams from more than 20 institutes across over 8 countries. This report includes an introduction to the challenge and datasets, as well as the methods used by the top 7 teams in the two tracks. More details can be found on our homepage: https://lsvos.github.io/.
