WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation
Robin Schön, Daniel Kienzle, Rainer Lienhart
TL;DR
This work addresses the scarcity of high-quality segmentation data for winter sports equipment by introducing WSESeg, a dataset with 7,452 masks across 4,221 images spanning 10 classes. It evaluates the Segment Anything Model (SAM) and HQ-SAM as interactive segmentation baselines on this domain and investigates test-time adaptation strategies (e.g., Click Adaptation, Click Mask, and mask-based ground-truth refinement) to reduce the number of user interactions and the failure rate. Results are reported as NoC$_{20}@85$ and FR$_{20}@85$, i.e., with a budget of 20 clicks and an IoU threshold of $0.85$. The study shows that online adaptation yields meaningful improvements, with SAM showing moderate gains and HQ-SAM benefiting more, though some classes remain challenging. The contributions are a new domain-specific dataset, an empirical assessment of foundation models in this niche, and practical adaptation techniques that make interactive segmentation more efficient for winter sports analytics.
Abstract
In this paper we introduce WSESeg (Winter Sports Equipment Segmentation), a new dataset containing instance segmentation masks for ten categories of winter sports equipment. Furthermore, we carry out interactive segmentation experiments on this dataset to explore possibilities for efficient further labeling. The SAM and HQ-SAM models are conceived as foundation models for user-guided segmentation. To measure their claimed generalization capability, we evaluate them on WSESeg. Since interactive segmentation yields easily exploitable ground truth data at test time, we test various online adaptation methods to explore potential improvements without explicitly fine-tuning the models. Our experiments show that our adaptation methods drastically reduce the Failure Rate (FR) and Number of Clicks (NoC) metrics, so that good segmentation results are generally reached with fewer interactions.
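To make the two metrics concrete, the following minimal sketch computes NoC and FR under their usual definitions in the interactive segmentation literature: NoC is the mean number of clicks needed to reach the IoU threshold, counting the full click budget for instances that never reach it, and FR is the fraction of such instances. The function names and the input layout (one list of per-click IoU values per instance) are assumptions made for illustration and are not taken from the paper's code.

```python
from typing import List, Sequence, Tuple


def clicks_to_reach(ious: Sequence[float],
                    threshold: float = 0.85,
                    max_clicks: int = 20) -> Tuple[int, bool]:
    """Return (clicks counted, failed) for a single instance.

    ious[i] is the IoU with the ground truth after click i + 1
    (hypothetical input layout, chosen for this sketch).
    """
    for click, iou in enumerate(ious[:max_clicks], start=1):
        if iou >= threshold:
            return click, False
    # Threshold never reached within the budget: count the full budget.
    return max_clicks, True


def noc_and_fr(all_ious: List[Sequence[float]],
               threshold: float = 0.85,
               max_clicks: int = 20) -> Tuple[float, float]:
    """Compute (NoC, FR) over a set of instances."""
    if not all_ious:
        raise ValueError("need at least one instance")
    results = [clicks_to_reach(s, threshold, max_clicks) for s in all_ious]
    noc = sum(clicks for clicks, _ in results) / len(results)
    fr = sum(1 for _, failed in results if failed) / len(results)
    return noc, fr


# Three toy instances: solved after 2 clicks, never solved, solved after 1 click.
traces = [[0.50, 0.90], [0.20] * 20, [0.86]]
noc, fr = noc_and_fr(traces)
```

With the default arguments this corresponds to NoC$_{20}@85$ and FR$_{20}@85$. In the toy example the counted clicks are 2, 20 and 1, so one of the three instances contributes to the failure rate.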
