Message Size Matters: AlterBFT's Approach to Practical Synchronous BFT in Public Clouds

Nenad Milošević, Daniel Cason, Zarko Milošević, Robert Soulé, Fernando Pedone

TL;DR

The paper addresses the latency-safety trade-off in Byzantine fault-tolerant consensus for public clouds by introducing a hybrid synchronous system model that separates small and large messages. AlterBFT leverages small, fast-coordinating messages to guarantee safety, while using large messages for value propagation under a GST-based bound to preserve liveness, achieving up to $15\times$ lower latency than synchronous contenders with comparable throughput and the same fault tolerance. A fast commit path, a refined equivocation-detection mechanism, and careful epoch-change timing further enhance performance, especially in failure-free scenarios after GST. Experimental evaluation across geo-distributed cloud deployments demonstrates substantial latency reductions for large blocks and competitive throughput relative to synchronous baselines, with robust behavior under equivocation attacks and scalable certificate management. The work has practical impact for blockchains in public clouds, enabling responsive consensus without sacrificing safety or increasing the required number of replicas beyond the traditional synchronous threshold.

Abstract

Synchronous consensus protocols offer a significant advantage over their asynchronous and partially synchronous counterparts by providing higher fault tolerance -- an essential benefit in distributed systems, like blockchains, where participants may have incentives to act maliciously. However, despite this advantage, synchronous protocols are often met with skepticism due to concerns about their performance, as the latency of synchronous protocols is tightly linked to a conservative time bound for message delivery. This paper introduces AlterBFT, a new Byzantine fault-tolerant consensus protocol. The key idea behind AlterBFT lies in the new model we propose, called the hybrid synchronous system model. The new model is inspired by empirical observations about network behavior in the public cloud environment and combines elements from the synchronous and partially synchronous models. Namely, it distinguishes between small messages that respect time bounds and large messages that may violate bounds but are eventually timely. Leveraging this observation, AlterBFT achieves up to $15\times$ lower latency than state-of-the-art synchronous protocols while maintaining similar throughput and the same fault tolerance. Compared to partially synchronous protocols, AlterBFT provides higher fault tolerance, higher throughput, and comparable latency.
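
The distinction between small and large messages can be illustrated with a minimal sketch. It is not taken from the paper: the class `HybridModel`, its field names, the size cutoff, and the specific deadline rule for large messages (the usual partial-synchrony form, `max(send time, GST) + bound`) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridModel:
    """Illustrative timing model; every field here is a hypothetical parameter."""
    small_threshold_bytes: int  # assumed cutoff between "small" and "large"
    delta_small: float          # bound that small messages always respect (s)
    delta_large: float          # bound that large messages respect after GST (s)
    gst: float                  # global stabilization time (unknown to replicas)

    def delivery_deadline(self, size_bytes: int, sent_at: float) -> float:
        """Latest delivery time the model allows for a message."""
        if size_bytes <= self.small_threshold_bytes:
            # Small messages are timely at all times (synchronous behavior).
            return sent_at + self.delta_small
        # Large messages are only eventually timely (assumed GST-based rule).
        return max(sent_at, self.gst) + self.delta_large

    def respects_model(self, size_bytes: int, sent_at: float,
                       delivered_at: float) -> bool:
        """True if an observed delivery is consistent with the model."""
        return delivered_at <= self.delivery_deadline(size_bytes, sent_at)


if __name__ == "__main__":
    model = HybridModel(small_threshold_bytes=4096, delta_small=0.5,
                        delta_large=2.0, gst=10.0)
    # A 128 KB block sent before GST may arrive late without violating the model.
    print(model.respects_model(128 * 1024, sent_at=3.0, delivered_at=9.0))
    # A 1 KB vote that takes a full second violates the small-message bound.
    print(model.respects_model(1024, sent_at=3.0, delivered_at=4.0))
```

Under these assumptions, a protocol could tie safety-critical timeouts to `delta_small` only, while delays of large messages before `gst` would affect progress but not safety.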

Paper Structure

This paper contains 38 sections, 12 theorems, 10 figures, 7 tables.

Key Result

Lemma 1

Every honest replica always progresses to the next epoch.

Figures (10)

  • Figure 1: Communication delays between two replicas located in the same AWS region (N. Virginia).
  • Figure 2: Average latency (top) and throughput (bottom) comparison for all protocols when varying system size (i.e., 25 and 85 replicas) and block size (all graphs in log scale).
  • Figure 3: AlterBFT and FastAlterBFT throughput (top) and latency (bottom) under equivocation attack, 25 replicas and 128 KB blocks.
  • Figure 4: Performance comparison of synchronous consensus with chunked proposals (Chunked-HS), conservative bounds (Sync-HS), and AlterBFT for 25 replicas with 128 KB blocks.
  • Figure 5: Message delays between N. Virginia and S. Paulo (x-axis in log scale) when sending 128 KB messages (Non-Chopped) versus sending 64 messages of 2 KB each (Chopped).
  • ...and 5 more figures

Theorems & Definitions (31)

  • Remark 3.1
  • Remark 4.1
  • Remark 5.1
  • Remark 5.2
  • Remark 5.3
  • Remark 5.4
  • Remark 5.5
  • Lemma 1
  • proof
  • Lemma 2
  • ...and 21 more