Sharding Distributed Databases: A Critical Review

Siamak Solat

Sharding Distributed Databases: A Critical Review

Siamak Solat

TL;DR

The paper addresses scalability bottlenecks in consensus-based distributed replication caused by message complexity, where classic PBFT exhibits $O(n^2)$ and Paxos/Raft $O(n)$ complexity, with view-change potentially raising to $O(f.n^3)$. It evaluates sharding as a parallelization strategy to mitigate these bottlenecks, analyzing both distributed databases and DLTs and distinguishing processing/sharding from storage/state sharding. It surveys major sharded implementations—Ethereum 2.0 (homogeneous multi-chain) and Polkadot (heterogeneous multi-chain)—and other sharded blockchains, as well as classic databases, highlighting shared-ledger scalability constraints and cross-shard transaction challenges, and discussing remedies such as graph-based approaches and Fisherman. The conclusions emphasize that sharding can approach linear scalability but requires careful handling of security, cross-shard coordination, and permissioning to be practically viable across different system classes.

Abstract

This article examines the significant challenges encountered in implementing sharding within distributed replication systems. It identifies the impediments of achieving consensus among large participant sets, leading to scalability, throughput, and performance limitations. These issues primarily arise due to the message complexity inherent in consensus mechanisms. In response, we investigate the potential of sharding to mitigate these challenges, analyzing current implementations within distributed replication systems. Additionally, we offer a comprehensive review of replication systems, encompassing both classical distributed databases as well as Distributed Ledger Technologies (DLTs) employing sharding techniques. Through this analysis, the article aims to provide insights into addressing the scalability and performance concerns in distributed replication systems.

Sharding Distributed Databases: A Critical Review

TL;DR

The paper addresses scalability bottlenecks in consensus-based distributed replication caused by message complexity, where classic PBFT exhibits

and Paxos/Raft

complexity, with view-change potentially raising to

. It evaluates sharding as a parallelization strategy to mitigate these bottlenecks, analyzing both distributed databases and DLTs and distinguishing processing/sharding from storage/state sharding. It surveys major sharded implementations—Ethereum 2.0 (homogeneous multi-chain) and Polkadot (heterogeneous multi-chain)—and other sharded blockchains, as well as classic databases, highlighting shared-ledger scalability constraints and cross-shard transaction challenges, and discussing remedies such as graph-based approaches and Fisherman. The conclusions emphasize that sharding can approach linear scalability but requires careful handling of security, cross-shard coordination, and permissioning to be practically viable across different system classes.

Abstract

Paper Structure (13 sections, 4 figures)

This paper contains 13 sections, 4 figures.

Introduction
Sharding at a Glance
Sharding Challenges
Distributing Nodes Between Shards:
Transactions Processing in Sharded DLTs:
Challenges With Shared Ledger Among Shards:
Challenges With Cross-Shard Transactions:
Overview of Sharding in Distributed Systems
Ethereum 2.0: Homogeneous Multi-Chain
Polkadot: Heterogeneous Multi-Chain
Other Sharded Blockchains
Sharding in Classic Distributed Databases
Conclusion

Figures (4)

Figure 1: The PBFT consensus message complexity where a primary node fails and a change-view with additional message exchange is required, so that for $f$ leader failures the message complexity increases to $O(f.n^3)$.
Figure 2: Throughput of a network that uses Paxos or PBFT consensus decreases drastically, as the number of nodes increases blockchain_consensus_protocols.
Figure 3: Most current sharding protocols use a random assignment approach for allocating and distributing nodes between shards due to security reasons.
Figure 10: A high-level view of the Polkadot architecture.

Sharding Distributed Databases: A Critical Review

TL;DR

Abstract

Sharding Distributed Databases: A Critical Review

Authors

TL;DR

Abstract

Table of Contents

Figures (4)