HyRES: A Hybrid Replication and Erasure Coding Approach to Data Storage
Daniel E. Lucani, Marcell Fehér
TL;DR
HyRES introduces a flexible hybrid redundancy framework that blends replication, MDS erasure coding, and local repairability to trade off storage cost, robustness, and repair traffic in distributed storage. It provides a mathematical framework and closed-form results for comparing HyRES against pure replication and MDS codes while accounting for the size of the storage network and for loss events, and it validates these findings with simulations. The results show that HyRES can achieve lower storage costs than replication and a lower file-loss probability than replication or erasure coding, while keeping repair traffic competitive, and that it generalizes prior hybrid approaches. The approach offers practical gains for managing hot and cold data in large-scale distributed storage systems and motivates network-size-aware design of redundancy strategies.
Abstract
Reliability in distributed storage systems has typically focused on the design and deployment of data replication or erasure coding techniques. Although some scenarios have considered the use of replication for hot data and erasure coding for cold data in the same system, each is designed in isolation. We propose HyRES, a hybrid scheme that incorporates the best characteristics of each scheme, resulting in additional design flexibility and potentially better system performance. We show that HyRES generalizes previously proposed hybrid schemes. We characterize the theoretical performance of HyRES, as well as that of replication and erasure coding, accounting for the effect of the size of the storage network. We validate our theoretical results using simulations. These results show that HyRES can simultaneously yield lower storage costs than replication, lower probabilities of file loss than replication and erasure coding with similar worst-case performance, and even lower effective repair traffic than replication when the network size is taken into account.
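
To make the trade-off between the two baseline schemes concrete, the following minimal Python sketch compares the storage overhead and file-loss probability of r-way replication and an (n, k) MDS code. It is not the HyRES model from the paper: it assumes a simplified textbook setting in which each storage node fails independently with probability p and no repair takes place, and it ignores network size. The function names and parameter values are hypothetical and chosen only for illustration.

```python
from math import comb


def replication_overhead(r: int) -> float:
    """Storage used per unit of data when keeping r full copies."""
    return float(r)


def mds_overhead(n: int, k: int) -> float:
    """Storage used per unit of data for an (n, k) MDS code."""
    return n / k


def replication_loss_prob(r: int, p: float) -> float:
    """File is lost only if all r replicas fail (independent failures assumed)."""
    return p ** r


def mds_loss_prob(n: int, k: int, p: float) -> float:
    """File is lost if more than n - k of the n fragments fail,
    i.e. fewer than k fragments survive (independent failures assumed)."""
    return sum(
        comb(n, f) * p ** f * (1 - p) ** (n - f)
        for f in range(n - k + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.01  # assumed per-node failure probability, illustrative only
    print("3-way replication:", replication_overhead(3), replication_loss_prob(3, p))
    print("(9, 6) MDS code:  ", mds_overhead(9, 6), mds_loss_prob(9, 6, p))
```

Under these assumptions, a (3, 1) MDS code coincides with 3-way replication, which is a quick sanity check on the two loss formulas. A hybrid scheme such as HyRES sits between these two extremes; its actual expressions are those derived in the paper, not the ones above.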
