Privacy-Preserving Data Quality Assessment for Time-Series IoT Sensors
Novoneel Chakraborty, Abhay Sharma, Jyotirmoy Dutta, Hari Dilip Kumar
TL;DR
This work addresses the assessment of data quality for time-series IoT data in smart cities while preserving privacy. It introduces an automated, privacy-preserving framework built on six metrics ($M_1$–$M_6$) derived from inter-arrival time analysis and validation against a declarative JSON schema. The metrics are objective, normalised, and locally computable, and all computation runs inside a trusted execution environment so that the assessment is data-blind, enabling zero-trust collaboration between a dataset owner and a data consumer. Deployed in a real smart-city data exchange, the framework demonstrates scalable private evaluation and reporting to city administrators; its limitations are tied to static datasets and enclave hardware constraints. The work advances practical, privacy-preserving data quality assessment for IoT at urban scale and points to near-term extensions in real-time monitoring and user-facing dashboards.
Abstract
Data from Internet of Things (IoT) sensors has emerged as a key contributor to decision-making processes in various domains. The quality of this data is crucial to the effectiveness of the applications built on it, yet assessing that quality is heavily context-dependent. Further, preserving the privacy of the data during quality assessment is critical in domains where sensitive data is prevalent. This paper proposes a novel framework for automated, objective, and privacy-preserving data quality assessment of time-series data from IoT sensors deployed in smart cities. To achieve objectivity, we leverage custom, autonomously computable metrics that parameterise temporal performance and adherence to a declarative schema document. Additionally, we utilise a trusted execution environment to create a "data-blind" model that ensures individual privacy, eliminates assessee bias, and enhances adaptability across data types. The paper describes this data quality assessment methodology for IoT sensors, emphasising its relevance within the smart-city context while addressing the growing need for privacy in the face of extensive data collection practices.
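As a rough illustration of the two kinds of metric described above, the following Python sketch computes one normalised score from inter-arrival times and one from schema adherence. Everything named here is an assumption made for illustration: the functions `regularity_score` and `schema_adherence`, the `expected_period` and `tolerance` parameters, and the simplified field-to-type mapping that stands in for a declarative JSON schema. These are not the paper's definitions of $M_1$–$M_6$, and the sketch leaves out the trusted execution environment entirely.

```python
from typing import Any, List, Mapping, Sequence


def inter_arrival_times(timestamps: Sequence[float]) -> List[float]:
    """Return the gaps between consecutive timestamps, after sorting them."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def regularity_score(
    timestamps: Sequence[float], expected_period: float, tolerance: float
) -> float:
    """Hypothetical metric: fraction of gaps within `tolerance` of the expected period.

    The result lies in [0, 1]; 1.0 means every packet arrived on schedule.
    """
    gaps = inter_arrival_times(timestamps)
    if not gaps:
        return 0.0  # assumption: fewer than two readings scores zero
    on_time = sum(1 for gap in gaps if abs(gap - expected_period) <= tolerance)
    return on_time / len(gaps)


def schema_adherence(
    records: Sequence[Mapping[str, Any]], field_types: Mapping[str, Any]
) -> float:
    """Hypothetical metric: fraction of records that carry every required
    field with a value of the expected Python type.

    `field_types` is a simplified stand-in for a JSON schema, not a real one.
    """
    if not records:
        return 0.0

    def conforms(record: Mapping[str, Any]) -> bool:
        return all(
            name in record and isinstance(record[name], expected)
            for name, expected in field_types.items()
        )

    return sum(1 for record in records if conforms(record)) / len(records)
```

For timestamps `[0, 10, 20, 35, 45]` with an expected period of 10 and a tolerance of 1, three of the four gaps are on time, so `regularity_score` returns 0.75. Both functions return only an aggregate score, never the readings themselves, which is the property a data-blind assessment relies on.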
