Optimal Sharding for Scalable Blockchains with Deconstructed SMR

Jianting Zhang, Zhongtang Luo, Raghavendra Ramesh, Aniket Kate

TL;DR

Arete addresses blockchain scalability by resolving the size-security dilemma through a deconstructed SMR that decouples consensus from processing. It uses one ordering shard and multiple processing shards; separating safety from liveness raises the fault tolerance of each task, so many small shards can be created while maintaining security. The design is realized as a certify-order-execute (COE) architecture. The system supports asynchronous cross-shard execution with lock-free guarantees and a shard reconfiguration mechanism that recovers liveness-violated shards, yielding near-linear scalability. Empirical evaluation on AWS with up to 500 nodes shows Arete delivering around 180K TPS and much lower cross-shard latency than prior protocols, while maintaining $0.9999$-probabilistic liveness.

Abstract

Sharding is proposed to enhance blockchain scalability. However, a size-security dilemma, in which every shard must be large enough to ensure its security, constrains the efficacy of individual shards and the degree of sharding itself. Most existing sharding solutions therefore rely on either weakening the adversary or making stronger assumptions on network links. This paper presents Arete, an optimally scalable blockchain sharding protocol designed to resolve the dilemma, based on the observation that if individual shards can tolerate a higher fraction of (Byzantine) faults, we can securely create smaller shards in a larger quantity. The key idea of Arete, therefore, is to improve the security resilience/threshold of shards by dividing the blockchain's State Machine Replication (SMR) process itself. First, similar to modern blockchains, Arete decouples SMR into three steps: transaction dissemination, ordering, and execution. However, unlike other blockchains, in Arete a single ordering shard performs the ordering task while multiple processing shards perform the dissemination and execution of blocks. As processing shards do not run consensus, each of them can tolerate up to half of its nodes being compromised. Moreover, the SMR process in the ordering shard is lightweight, as it operates only on block digests. Second, Arete considers safety and liveness against Byzantine failures separately, which improves the safety threshold further while tolerating temporary liveness violations in a controlled manner. Apart from enabling more shards of optimal size, such a deconstructed SMR scheme also lets us devise a novel certify-order-execute architecture that fully parallelizes transaction handling, thereby improving the performance of sharding systems. We implement Arete and evaluate it in an AWS environment with up to 500 nodes, showing that Arete outperforms state-of-the-art sharding protocols.
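
The link between a shard's fault threshold and its minimum secure size can be illustrated numerically. The following is a minimal sketch, not the paper's analysis: it assumes nodes are assigned to a shard by uniform sampling without replacement (a hypergeometric model), that a shard fails when its Byzantine count exceeds `threshold * shard_size`, and it considers a single shard only. The function names, the 25% global adversary, and the `1e-6` failure bound are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, floor


def shard_failure_prob(n_total, n_byz, shard_size, threshold):
    """P[more than threshold * shard_size Byzantine nodes land in one shard].

    Assumption: the shard is a uniform sample without replacement
    from n_total nodes, n_byz of which are Byzantine.
    """
    max_tolerated = floor(threshold * shard_size)
    total = comb(n_total, shard_size)
    bad = sum(
        comb(n_byz, k) * comb(n_total - n_byz, shard_size - k)
        for k in range(max_tolerated + 1, min(n_byz, shard_size) + 1)
    )
    return bad / total


def min_shard_size(n_total, n_byz, threshold, eps):
    """Smallest shard size whose failure probability is at most eps."""
    for m in range(1, n_total + 1):
        if shard_failure_prob(n_total, n_byz, m, threshold) <= eps:
            return m
    return None


if __name__ == "__main__":
    # Hypothetical setting: 500 nodes, 125 of them Byzantine.
    for t in (Fraction(1, 3), Fraction(1, 2)):
        print(t, min_shard_size(500, 125, t, 1e-6))
```

Under this model, the minimum size found for a threshold of 1/2 is smaller than the one found for 1/3, which is the observation the abstract builds on: a higher per-shard threshold permits smaller, and therefore more, shards.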

Paper Structure

This paper contains 33 sections, 12 theorems, 3 equations, 12 figures, 2 tables, 4 algorithms.

Key Result

Lemma 1

Given a processing shard with $|S_P^{\mathit{sid}}|$ nodes and a safety threshold $f_S$, any honest node can recover an intact execution block $\mathit{EB}_1$ if $\mathit{EB}_1$'s associated certificate block contains at least $f_S\cdot |S_P^{\mathit{sid}}|+1$ signatures from distinct nodes of $S_P^{\mathit{sid}}$ …
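
The signature count in Lemma 1 can be checked mechanically. The sketch below is an illustration under stated assumptions, not the paper's algorithm: signatures are assumed to be already verified, signers are represented by hypothetical node identifiers, and the bound $f_S\cdot |S_P^{\mathit{sid}}|+1$ is applied literally using exact rational arithmetic. The function name `meets_availability_quorum` is invented for this example.

```python
from fractions import Fraction


def meets_availability_quorum(signers, shard_members, f_s):
    """Return True if distinct in-shard signers reach f_s * |shard| + 1.

    Assumptions: `signers` lists IDs whose signatures were already
    verified; duplicates and IDs outside the shard are not counted.
    """
    distinct = set(signers) & set(shard_members)
    return len(distinct) >= f_s * len(shard_members) + 1


if __name__ == "__main__":
    members = {f"n{i}" for i in range(10)}  # hypothetical 10-node shard
    f_s = Fraction(1, 2)                    # needs at least 6 distinct signers
    print(meets_availability_quorum(["n0", "n0", "n1", "n2", "n3", "n4"], members, f_s))
    print(meets_availability_quorum(["n0", "n1", "n2", "n3", "n4", "n5"], members, f_s))
```

The intuition behind such a bound is that if at most $f_S\cdot |S_P^{\mathit{sid}}|$ nodes are Byzantine, a set of this many distinct signers must include at least one honest node.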

Figures (12)

  • Figure 1: Arete overview: the system is divided into one ordering shard and $k$ processing shards $\{S_P^1 \cdots S_P^{k}\}$. The ordering shard runs a BFT consensus to globally order transactions, tolerating up to $f_S=f_L<1/3$ Byzantine nodes. A processing shard performs the data dissemination and execution tasks, tolerating up to $f_S\geq1/2$ Byzantine nodes.
  • Figure 2: The COE architecture overview: ① The CERTIFY stage: each processing shard disseminates and generates certified messages, performing the data dissemination task. ② The ORDER stage: the ordering shard performs the ordering task to establish a global order. ③ The EXECUTE stage: each processing shard executes and finalizes the ordered transactions.
  • Figure 3: The workflow of handling cross-shard transactions in Arete: (a) Two atomic swap transactions $ctx_1$ and $ctx_2$; both are cross-shard and involve contract $TK_1$, managed by shard $S_P^1$, and contract $TK_2$, managed by shard $S_P^2$. (b) Arete handles the cross-shard transactions $ctx_1$ and $ctx_2$ via a lock-free execution approach.
  • Figure 4: Throughput-Nodes
  • Figure 5: End-to-end latency-Nodes
  • ...and 7 more figures
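
The three COE stages named in the Figure 2 caption can be sketched as a toy pipeline. This is a simplified illustration, not Arete's implementation: signature collection is omitted, the ordering shard's BFT consensus is replaced by plain arrival order, and the `Certified` type and all function names are hypothetical. The point it shows is that the ordering step handles only digests while payloads stay inside the processing shards.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Certified:
    shard_id: int
    digest: str  # the only data the ordering step sees


def certify(shard_id, payload, store):
    """CERTIFY: keep the payload in the processing shard, emit its digest."""
    digest = hashlib.sha256(payload).hexdigest()
    store[digest] = payload
    return Certified(shard_id, digest)


def order(certified_msgs):
    """ORDER: stand-in for BFT consensus; here simply arrival order."""
    return list(certified_msgs)


def execute(shard_id, global_order, store):
    """EXECUTE: replay this shard's payloads in the agreed global order."""
    return [store[c.digest] for c in global_order if c.shard_id == shard_id]


if __name__ == "__main__":
    stores = {1: {}, 2: {}}  # per-shard payload storage
    msgs = [
        certify(1, b"tx-a", stores[1]),
        certify(2, b"tx-b", stores[2]),
        certify(1, b"tx-c", stores[1]),
    ]
    log = order(msgs)
    print(execute(1, log, stores[1]))
    print(execute(2, log, stores[2]))
```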

Theorems & Definitions (19)

  • Definition 1: Safety
  • Definition 2: Liveness
  • Definition 3: Safety Threshold $f_S$
  • Definition 4: Liveness Threshold $f_L$
  • Definition 5: Sharding Safety
  • Definition 6: Sharding Liveness
  • Definition 7: $\mathcal{P}$-probabilistic liveness
  • Lemma 1: Data Availability
  • Lemma 2
  • Lemma 3: Cross-shard Atomicity
  • ...and 9 more