FASTEN: Towards a FAult-tolerant and STorage EfficieNt Cloud: Balancing Between Replication and Deduplication
Sabbir Ahmed, Md Nahiduzzaman, Tariqul Islam, Faisal Haque Bappy, Tarannum Shaila Zaman, Raiful Hasan
TL;DR
FASTEN addresses the cloud storage challenge of balancing deduplication against replication to achieve both storage efficiency and high availability. It introduces a data dispersal framework built on optimum-server selection and maximum fault-tolerance subsets, coupled with a server-rating mechanism that uses the weighted score $WR = DD \cdot a + S_l \cdot b + Q_s \cdot c + Dis \cdot d + R_c \cdot e$ and a Gale–Shapley-style DP to maximize placements. The system relies on convergent encryption for privacy, Merkle hash trees (MHT) for integrity, and HashMap-based indexing for efficient deduplication and auditing, including batch auditing. Experimental results across varying file and block sizes, redundancy factors, and server counts show that FASTEN improves fault tolerance, reduces costs, and speeds up block- and file-level deduplication and batch auditing relative to MHT-based and other deduplication schemes.
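As an illustration of the weighted score WR, the following minimal Python sketch computes the linear combination for a few servers and ranks them. It is not the authors' implementation. The summary does not define the symbols DD, S_l, Q_s, Dis, and R_c, so the sketch treats them as opaque numeric metrics already normalized to a common scale, and it assumes that a higher WR is preferable. The names `ServerMetrics`, `weighted_rating`, and `rank_servers`, the sample values, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerMetrics:
    """Per-server inputs to WR. Field meanings are not defined in the summary;
    they are treated here as opaque, pre-normalized numbers (assumption)."""
    name: str
    dd: float   # DD
    s_l: float  # S_l
    q_s: float  # Q_s
    dis: float  # Dis
    r_c: float  # R_c


def weighted_rating(m: ServerMetrics, weights: tuple) -> float:
    """WR = DD*a + S_l*b + Q_s*c + Dis*d + R_c*e, with weights = (a, b, c, d, e)."""
    a, b, c, d, e = weights
    return m.dd * a + m.s_l * b + m.q_s * c + m.dis * d + m.r_c * e


def rank_servers(servers: list, weights: tuple) -> list:
    """Sort servers by WR, highest first (assumes higher WR is better)."""
    return sorted(servers, key=lambda m: weighted_rating(m, weights), reverse=True)


# Hypothetical sample data, chosen only to exercise the formula.
servers = [
    ServerMetrics("s1", dd=0.9, s_l=0.5, q_s=0.7, dis=0.2, r_c=0.4),
    ServerMetrics("s2", dd=0.4, s_l=0.8, q_s=0.6, dis=0.9, r_c=0.5),
    ServerMetrics("s3", dd=0.6, s_l=0.6, q_s=0.6, dis=0.6, r_c=0.6),
]
weights = (0.3, 0.2, 0.2, 0.2, 0.1)  # a, b, c, d, e (hypothetical)

if __name__ == "__main__":
    for m in rank_servers(servers, weights):
        print(m.name, round(weighted_rating(m, weights), 3))
```

Because WR is linear in its inputs, changing a single weight shifts the ranking in a predictable way. A metric for which lower values are better would have to be inverted or given a negative weight before being combined, which this sketch does not attempt.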
Abstract
With the surge in cloud storage adoption, enterprises face challenges managing data duplication and exponential data growth. Deduplication mitigates redundancy, yet some redundancy must be maintained to ensure high availability, which incurs storage costs. Balancing these two aspects is a significant research concern. We propose FASTEN, a distributed cloud storage scheme that ensures efficiency, security, and high availability. FASTEN achieves fault tolerance by dispersing data subsets optimally across servers and maintains redundancy for high availability. Experimental results show FASTEN's effectiveness in fault tolerance, cost reduction, batch auditing, and file- and block-level deduplication. It outperforms existing systems with low time complexity, strong fault tolerance, and commendable deduplication performance.
