Tri-Select: A Multi-Stage Visual Data Selection Framework for Mobile Visual Crowdsensing
Jiayu Zhang, Kaixing Zhao, Tianhao Shao, Bin Guo, Liang He
TL;DR
Tri-Select is a three-stage visual data selection framework for mobile visual crowdsensing (MVC) that tackles redundancy and heterogeneity in large-scale image collections. It combines metadata-based pre-filtering, spectral clustering on spatial and directional features, and a visual selection step that uses SIFT features with a maximum independent set (MIS) search to produce a compact, representative subset. Experiments show substantial data reduction, coherent clusters, and high-quality, non-redundant selections, with favorable efficiency relative to baselines and performance suitable for edge deployment. The framework is modular and scalable, suits real-time and large-scale MVC deployments, and could be extended with semantic features and task-specific constraints.
Abstract
Mobile visual crowdsensing enables large-scale, fine-grained environmental monitoring through the collection of images from distributed mobile devices. However, the resulting data is often redundant and heterogeneous due to overlapping acquisition perspectives, varying resolutions, and diverse user behaviors. To address these challenges, this paper proposes Tri-Select, a multi-stage visual data selection framework that efficiently filters redundant and low-quality images. Tri-Select operates in three stages: (1) metadata-based filtering to discard irrelevant samples; (2) spatial similarity-based spectral clustering to organize candidate images; and (3) a visual-feature-guided selection based on maximum independent set search to retain high-quality, representative images. Experiments on real-world and public datasets demonstrate that Tri-Select improves both selection efficiency and dataset quality, making it well-suited for scalable crowdsensing applications.
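To make the third stage more concrete, the following is a minimal sketch of selecting a non-redundant subset from a redundancy graph. It is an illustration under stated assumptions, not the paper's implementation: the function name `greedy_independent_set`, the `threshold` value, and the `quality` and `similarity` inputs are all hypothetical, and the greedy pass yields a maximal independent set rather than a guaranteed maximum one. In practice the similarity scores might come from SIFT feature matching within each spatial cluster.

```python
from itertools import combinations


def greedy_independent_set(quality, similarity, threshold=0.8):
    """Pick a non-redundant subset of images (illustrative sketch only).

    quality:    dict image_id -> quality score (higher is better); assumed input
    similarity: dict frozenset({a, b}) -> visual similarity in [0, 1]; assumed input
    threshold:  pairs at or above this similarity count as redundant (assumed value)
    """
    # Build the redundancy graph: an edge joins two images that look alike.
    neighbors = {img: set() for img in quality}
    for a, b in combinations(quality, 2):
        if similarity.get(frozenset((a, b)), 0.0) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    selected, excluded = [], set()
    # Visit images from best to worst quality; ties are broken by id.
    for img in sorted(quality, key=lambda i: (-quality[i], i)):
        if img in excluded:
            continue  # a better-quality near-duplicate was already kept
        selected.append(img)
        excluded |= neighbors[img]  # drop everything redundant with this image
    return selected


# Toy cluster: "a"/"b" and "b"/"c" are near-duplicates, "d" is distinct.
quality = {"a": 0.9, "b": 0.7, "c": 0.8, "d": 0.5}
similarity = {
    frozenset(("a", "b")): 0.95,
    frozenset(("b", "c")): 0.85,
    frozenset(("c", "d")): 0.30,
}
print(greedy_independent_set(quality, similarity))  # ['a', 'c', 'd']
```

No two selected images share an edge in the redundancy graph, so the result contains no pair above the similarity threshold, and visiting images in quality order biases the selection toward higher-quality representatives.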
