Evaluating Fault Tolerance and Scalability in Distributed File Systems: A Case Study of GFS, HDFS, and MinIO
Shubham Malhotra, Fnu Yashu, Muhammad Saqib, Dipkumar Mehta, Jagdish Jangid, Sachin Dixit
TL;DR
This paper evaluates fault tolerance and scalability across three distributed file systems (GFS, HDFS, and MinIO), focusing on replication, erasure coding, and cloud-native deployment. It analyzes architectural choices, data locality, and disaster recovery mechanisms to understand how each system performs under large-scale workloads and changing demand. The study offers practical guidelines for selecting a DFS based on enterprise needs, contrasting append-optimized, lease-based designs with cloud-friendly object storage and with storage tightly integrated with data-processing frameworks. The findings emphasize that system selection should align with workload characteristics, whether the priority is high availability, bulk analytics, or scalable cloud integration.
Abstract
Distributed File Systems (DFSs) are essential for managing vast datasets across multiple servers, offering benefits in scalability, fault tolerance, and data accessibility. This paper presents a comprehensive evaluation of three prominent DFSs: Google File System (GFS), Hadoop Distributed File System (HDFS), and MinIO. It focuses on their fault tolerance mechanisms and scalability under varying data loads and client demands. The analysis examines how these systems handle data redundancy, server failures, and client access protocols to ensure reliability in dynamic, large-scale environments. It also assesses the impact of system design on performance, particularly in distributed and cloud computing architectures. By comparing the strengths and limitations of each DFS, the paper provides practical insights for selecting the most appropriate system for different enterprise needs, from high-availability storage to big data analytics.
