Lightweight Trustworthy Distributed Clustering
Hongyang Li, Caesar Wu, Mohammed Chadli, Said Mammar, Pascal Bouvry
TL;DR
This work addresses trustworthy distributed clustering in resource-constrained edge computing systems (ECS) by proposing a lightweight, fully distributed $k$-means algorithm that protects each node's private data during cluster-center updates via additive secret sharing, without sacrificing clustering accuracy. The method produces the same outcomes as standard $k$-means, offers information-theoretic privacy under an honest-majority model, and incurs lower computational and communication overhead than server-centric secure multiparty computation (MPC) approaches. Key contributions include a building block for secure distributed averaging, a privacy-preserving center-update mechanism using extended per-cluster vectors, and rigorous per-iteration and cross-iteration privacy analyses. The approach enables secure, scalable, real-time clustering in ECS without relying on trusted third parties; future work aims to relax the honest-majority assumption and to validate the method on real edge datasets.
Abstract
Ensuring data trustworthiness within individual edge nodes while facilitating collaborative data processing poses a critical challenge in edge computing systems (ECS), particularly in resource-constrained scenarios such as autonomous systems sensor networks, industrial IoT, and smart cities. This paper presents a lightweight, fully distributed k-means clustering algorithm specifically adapted for edge environments. During the cluster center update phase, the algorithm uses distributed averaging with additive secret sharing, a secure multiparty computation technique, to ensure the accuracy and trustworthiness of data across nodes.
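To make the center-update idea concrete, the following is a minimal sketch of distributed averaging with additive secret sharing, not the paper's exact protocol. It rests on several assumptions that do not come from the source: the prime modulus `P`, the fixed-point factor `SCALE`, the function names, and the reading of an "extended per-cluster vector" as a node's local coordinate sums with its local point count appended. Message passing between nodes is simulated in a single process.

```python
import secrets

P = 2**61 - 1   # public prime modulus (assumed; any prime larger than the true sums works)
SCALE = 10**4   # fixed-point scaling factor (assumed)


def encode(x):
    """Map a real number to a field element using fixed-point encoding."""
    return round(x * SCALE) % P


def decode(v):
    """Map a field element back to a signed real number."""
    if v > P // 2:
        v -= P
    return v / SCALE


def share(value, n):
    """Split `value` into n additive shares that sum to `value` modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % P)
    return shares


def secure_sum(private_vectors):
    """Sum the nodes' private vectors coordinate-wise.

    Each node splits every coordinate into one share per node. Node j only
    ever sees the shares addressed to it and publishes their sum, so no
    individual input is sent in the clear.
    """
    n = len(private_vectors)
    dim = len(private_vectors[0])
    partial = [[0] * dim for _ in range(n)]  # partial[j]: sum of shares held by node j
    for vec in private_vectors:
        for d, value in enumerate(vec):
            for j, s in enumerate(share(value, n)):
                partial[j][d] = (partial[j][d] + s) % P
    # Combining the published partial sums reveals only the global total.
    return [sum(partial[j][d] for j in range(n)) % P for d in range(dim)]


def update_center(local_sums, local_counts):
    """Compute one cluster's new center from per-node sums and counts."""
    extended = [
        [encode(x) for x in sums] + [count % P]  # append the unscaled point count
        for sums, count in zip(local_sums, local_counts)
    ]
    total = secure_sum(extended)
    count = total[-1]
    if count == 0:
        return None  # empty cluster: the center is left to the caller
    return [decode(v) / count for v in total[:-1]]


# Three nodes, one cluster, two-dimensional points.
center = update_center(
    local_sums=[[1.0, 2.0], [3.0, 4.0], [-1.0, 0.0]],
    local_counts=[1, 2, 1],
)
```

Because the shares cancel exactly modulo `P`, the recovered sums and count match the plaintext values up to fixed-point rounding, which is why a scheme of this kind can reproduce the standard k-means update.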
