Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?
Jiajie Li, Bo Gu, Shimin Gong, Zhou Su, Mohsen Guizani
TL;DR
This paper tackles the challenge of evaluating and improving mobile crowdsensing (MCS) data quality in the absence of ground truth. It introduces PRBTD, a prediction- and reputation-based truth discovery framework that combines three components: a correlation-focused spatio-temporal Transformer network (CFSTTN) that predicts the ground truth, an analysis of the implications among sensing data that stabilizes quality assessment when predictions are inaccurate, and a reputation-based truth discovery (TD) module that identifies low-quality data and malicious mobile users (MUs). By integrating prediction, implications, and reputations, PRBTD achieves higher data quality and more accurate detection of malicious MUs, even with noisy historical data, bursty values, or sparse data, as demonstrated on the TaxiBJ dataset. The approach is of practical value for real-time MCS deployments, where more reliable sensing data can support incentive and governance mechanisms. Key contributions are the CFSTTN-based prediction, the implication-driven data features, and the iterative reputation-based TD that uses the data implications as feedback.
Abstract
Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains. However, ensuring the quality of the sensing data submitted by mobile users (MUs) remains a complex and challenging problem. Addressing it requires a method that can detect low-quality sensing data and identify malicious MUs that may disrupt the normal operations of an MCS system. This article therefore proposes a prediction- and reputation-based truth discovery (PRBTD) framework, which can separate low-quality data from high-quality data in sensing tasks. First, we apply a correlation-focused spatio-temporal Transformer network that learns from the historical sensing data and predicts the ground truth of the data submitted by MUs. However, because of noise in the historical training data and bursty values within the sensing data, the prediction results can be inaccurate. To address this issue, we evaluate data quality using the implications among the sensing data, which are learned from the prediction results but are stable and less affected by inaccurate predictions. Finally, we design a reputation-based truth discovery (TD) module that uses these implications to identify low-quality data. Given the sensing data submitted by MUs, PRBTD can eliminate the data with heavy noise and identify malicious MUs with high accuracy. Extensive experimental results demonstrate that the PRBTD method outperforms existing methods in terms of identification accuracy and data quality enhancement.
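The abstract does not give the update rules of the TD module, so the following is only a minimal sketch of a generic reputation-weighted truth discovery loop, not the authors' PRBTD. It alternates between estimating each task's value as a weighted mean of the reports and lowering the weight of users whose reports sit far from those estimates. The function names, the log-ratio weight update (of the kind used in CRH-style truth discovery), the flagging threshold, and the toy data are all assumptions made for illustration. The prediction and implication stages are omitted entirely.

```python
import math


def truth_discovery(reports, iterations=20, eps=1e-9):
    """Generic iterative truth discovery; an illustration, not PRBTD itself.

    reports: dict mapping user id -> {task id: reported value}.
    Returns (truths, weights): per-task estimates and per-user weights.
    """
    users = list(reports)
    tasks = sorted({task for obs in reports.values() for task in obs})
    weights = {u: 1.0 for u in users}  # every user starts with equal reputation
    truths = {}
    for _ in range(iterations):
        # Truth step: reputation-weighted mean of the reports for each task.
        for task in tasks:
            voters = [u for u in users if task in reports[u]]
            total_weight = sum(weights[u] for u in voters)
            truths[task] = (
                sum(weights[u] * reports[u][task] for u in voters) / total_weight
            )
        # Reputation step: a larger share of the total error means a lower weight.
        errors = {
            u: sum((value - truths[task]) ** 2 for task, value in reports[u].items())
            for u in users
        }
        total_error = sum(errors.values()) + eps * len(users)
        for u in users:
            share = (errors[u] + eps) / total_error
            weights[u] = max(-math.log(share), eps)  # floor keeps weights positive
    return truths, weights


def flag_low_reputation(weights, ratio=0.1):
    """Return users whose weight is below `ratio` times the mean weight.

    The 0.1 default is an arbitrary illustrative threshold.
    """
    mean_weight = sum(weights.values()) / len(weights)
    return sorted(u for u, w in weights.items() if w < ratio * mean_weight)


# Hypothetical toy data: three consistent users and one outlier ("m").
reports = {
    "a": {"t1": 10.0, "t2": 20.0},
    "b": {"t1": 10.2, "t2": 19.8},
    "c": {"t1": 9.8, "t2": 20.2},
    "m": {"t1": 50.0, "t2": 90.0},
}
truths, weights = truth_discovery(reports)
suspects = flag_low_reputation(weights)
```

In this toy run the outlier's weight collapses after a few iterations, so the estimates settle near the values reported by the consistent users. A loop of this kind relies only on agreement among users, which is the limitation that motivates adding prediction and data implications in the framework described above.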
