PM-Dedup: Secure Deduplication with Partial Migration from Cloud to Edge Servers
Zhaokang Ke, Haoyu Gong, David H. C. Du
TL;DR
The paper addresses the latency and cloud-overhead challenges of secure encrypted deduplication in cloud storage by migrating part of the deduplication checking and Proof of Ownership (PoW) work from the cloud to Intel SGX enclaves on edge servers. PM-Dedup keeps a full-index in the cloud and uses a centralized key server for server-aided message-locked encryption (MLE), while each edge server holds a locally optimized share-index and pre-computed PoW data, which significantly reduces interactions with the cloud. Its contributions are CMS-based share-index selection augmented by logical locality, a dual-level PoW scheme with pre-computed challenges, an SGX-based migration design, and an end-to-end implementation evaluated extensively on real-world datasets, showing substantial latency reductions and improved deduplication efficiency. The work demonstrates practical, scalable secure deduplication for organizations with distributed branches, balancing security guarantees against performance gains in edge-assisted architectures.
Abstract
An increasing number of users and enterprises store their data in the cloud but do not fully trust cloud providers with that data in plaintext form. To address this concern, they encrypt their data before uploading it. However, encrypting with different keys turns even identical data into different ciphertexts, which makes deduplication less effective. Encrypted deduplication avoids this issue by deriving keys from the content itself, so that identical data chunks produce the same ciphertext and the cloud can efficiently identify and remove duplicates even in encrypted form. Current encrypted deduplication work falls into two types: target-based and source-based. In target-based encrypted deduplication, clients upload all encrypted chunks (the chunk being the basic unit of deduplication) to the cloud, which incurs high network bandwidth overhead. In source-based deduplication, clients first upload the fingerprints (hashes) of encrypted chunks for duplicate checking and then upload only the unique encrypted chunks. This reduces network transfer but introduces high latency, exposes potential side-channel attacks that must be mitigated with Proof of Ownership (PoW), and places high computing overhead on the cloud. Reducing latency and the network and cloud overheads while ensuring security has therefore become a significant challenge for secure data deduplication in cloud storage. In response to this challenge, we present PM-Dedup, a novel secure source-based deduplication approach that relocates a portion of the deduplication checking process and PoW tasks from the cloud to trusted execution environments (TEEs) in client-side edge servers. We also propose several designs that enhance the security and efficiency of data deduplication.
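To make the source-based flow described above concrete, the following is a minimal Python sketch, not PM-Dedup's implementation. It assumes a key derived directly from the chunk hash (PM-Dedup instead uses a key server), a toy XOR keystream in place of a real cipher, and an in-memory stand-in for the cloud index. All names (`content_key`, `toy_encrypt`, `ToyCloud`, `source_based_upload`) are hypothetical, and PoW and the edge-side index are omitted.

```python
import hashlib


def content_key(chunk: bytes) -> bytes:
    """Derive the key from the chunk content (assumption: no key server)."""
    return hashlib.sha256(b"key|" + chunk).digest()


def toy_encrypt(key: bytes, chunk: bytes) -> bytes:
    """Deterministic XOR with a SHA-256 counter keystream.

    Toy stand-in for a real cipher; NOT secure, illustration only.
    Applying it twice with the same key returns the original chunk.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(chunk, stream))


def fingerprint(ciphertext: bytes) -> str:
    """Fingerprint of an encrypted chunk, used for duplicate checking."""
    return hashlib.sha256(ciphertext).hexdigest()


class ToyCloud:
    """Hypothetical in-memory cloud: maps fingerprint -> ciphertext."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}

    def missing(self, fingerprints: list[str]) -> list[str]:
        """Return the fingerprints the cloud has not stored yet."""
        return [fp for fp in fingerprints if fp not in self.store]

    def upload(self, fp: str, ciphertext: bytes) -> None:
        self.store[fp] = ciphertext


def source_based_upload(cloud: ToyCloud, chunks: list[bytes]) -> int:
    """Send fingerprints first, then upload only chunks the cloud lacks.

    Returns the number of encrypted chunks actually transferred.
    """
    encrypted: dict[str, bytes] = {}
    for chunk in chunks:
        ciphertext = toy_encrypt(content_key(chunk), chunk)
        encrypted[fingerprint(ciphertext)] = ciphertext
    needed = cloud.missing(list(encrypted))
    for fp in needed:
        cloud.upload(fp, encrypted[fp])
    return len(needed)
```

Because the key depends only on the chunk, two clients holding the same chunk produce the same ciphertext and fingerprint, so the second upload is skipped. The fingerprint round trip in `missing` is the cloud interaction whose latency the abstract highlights.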
