Raptr: Prefix Consensus for Robust High-Performance BFT

Andrei Tonkikh, Balaji Arun, Zhuolun Xiang, Zekun Li, Alexander Spiegelman

TL;DR

Raptr tackles the classic latency-throughput-robustness tradeoff in Byzantine fault-tolerant SMR by introducing Prefix Consensus, a mechanism that allows voting on prefixes of a block and committing sub-block prefixes to maintain progress even when some data is missing. It builds on a latency-optimized Jolteon* baseline with Quorum Store, and integrates data availability proofs into the consensus path to achieve high throughput without sacrificing safety, while enabling rapid recovery under faults via timeout certificates. The key contributions include the novel prefix-containment safety framework, decoupled availability and safety quorums, and sub-blocks with efficient aggregate signatures and no-commit proofs, yielding up to 260k TPS with sub-second latency in geo-distributed tests and robust behavior under network glitches. Practically, Raptr combines the throughput advantages of DAG-like dissemination with the low-latency guarantees of leader-based protocols, offering a scalable, robust BFT solution for large-scale blockchain systems.

Abstract

In this paper, we present Raptr, a Byzantine fault-tolerant state machine replication (BFT SMR) protocol that combines strong robustness with high throughput, while attaining near-optimal theoretical latency. Raptr delivers exceptionally low latency and high throughput under favorable conditions, and it degrades gracefully in the presence of Byzantine faults and network attacks. Existing high-throughput BFT SMR protocols typically take either pessimistic or optimistic approaches to data dissemination: the former suffers from suboptimal latency in favorable conditions, while the latter deteriorates sharply under minimal attacks or network instability. Raptr bridges this gap, combining the strengths of both approaches through a novel Prefix Consensus mechanism. We implement Raptr and evaluate it against several state-of-the-art protocols in a geo-distributed environment with 100 replicas. Raptr achieves 260,000 transactions per second (TPS) with sub-second latency under favorable conditions, sustaining 610ms at 10,000 TPS and 755ms at 250,000 TPS. It remains robust under network glitches, showing minimal performance degradation even with a 1% message drop rate.

Paper Structure

This paper contains 66 sections, 13 theorems, 5 figures.

Key Result

lemma 1

For any two sets of replicas $Q_1, Q_2$ such that $|Q_1| \ge \left\lceil \frac{n + f + 1}{2} \right\rceil$ and $|Q_2| \ge \left\lceil \frac{n + f + 1}{2} \right\rceil$, there is at least one honest replica in $Q_1 \cap Q_2$.
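
The counting argument behind this lemma is short. Assuming $n$ replicas in total, of which at most $f$ are Byzantine (the usual reading of these symbols, not restated in this excerpt), inclusion-exclusion gives $|Q_1 \cap Q_2| \ge |Q_1| + |Q_2| - n \ge (n + f + 1) - n = f + 1$, so the intersection cannot consist of Byzantine replicas alone. The following is a minimal brute-force sketch that checks this for small configurations; the function names are illustrative and do not come from the paper.

```python
from itertools import combinations


def quorum_size(n: int, f: int) -> int:
    """Quorum threshold from Lemma 1: ceil((n + f + 1) / 2), in integer arithmetic."""
    return (n + f + 2) // 2


def min_honest_overlap(n: int, f: int) -> int:
    """Fewest honest replicas shared by any two threshold-size quorums.

    Assumption: replicas 0..f-1 are the Byzantine ones. Every pair of quorums
    is enumerated, so the choice of which f replicas are faulty does not
    change the minimum.
    """
    byzantine = set(range(f))
    q = quorum_size(n, f)
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len((q1 & q2) - byzantine) for q1 in quorums for q2 in quorums)


if __name__ == "__main__":
    # Small (n, f) pairs with n = 3f + 1; larger n makes the enumeration slow.
    for n, f in [(4, 1), (7, 2), (10, 3)]:
        print(n, f, quorum_size(n, f), min_honest_overlap(n, f))
```

Only quorums of exactly the threshold size are enumerated, since enlarging either quorum can only enlarge the intersection.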

Figures (5)

  • Figure 1: Illustration of the non-binary voting on prefixes in Raptr, as described in Sections \ref{sec:raptr:intuition} and \ref{sec:raptr:description}. Four replicas ($R_1,\dots,R_4$) receive different subsets of batches (green) of the same block, and each votes on the longest prefix available to it. A quorum of QC-votes forms a quorum certificate (QC), certifying a prefix of the block. In this example, suppose $S=2$: QC-votes from $R_1,R_2,R_3$ certify prefix 3, while QC-votes from $R_2,R_3,R_4$ certify prefix 4. After forming a QC, replicas vote to commit the certified prefix. A quorum of CC-votes forms a commit certificate (CC), committing a prefix. Here, CC-votes from $R_1,R_2,R_3$ commit prefix 3, and those from $R_2,R_3,R_4$ commit prefix 4. The next block proposal extending either CC will extend the maximum certified prefix in the quorum, which is 4 in both cases.
  • Figure 2: Common-case performance of Raptr versus other protocols. The points represent the 50th-percentile latency, and the error bars show the 25th- and 75th-percentile latencies.
  • Figure 3: Latency breakdown for Raptr and Aptos+ under fault-free conditions.
  • Figure 4: Impact of a partial network glitch on performance. Note that the latency y-axis uses a log scale.
  • Figure 5: Impact of a full network glitch on performance. Note that the latency y-axis uses a log scale.
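
The prefix arithmetic in the Figure 1 caption can be sketched in a few lines. The caption does not state the aggregation rules or the individual votes, so everything here is an assumption chosen to be consistent with its numbers: a QC is taken to certify the $S$-th largest prefix among its QC-votes (the longest prefix held by at least $S$ voters), a CC is taken to commit the minimum certified prefix among its CC-votes, and the next proposal extends the maximum. The vote values and function names are hypothetical.

```python
def certified_prefix(qc_votes: dict[str, int], s: int) -> int:
    """Assumed QC rule: the s-th largest voted prefix, held by at least s voters."""
    return sorted(qc_votes.values(), reverse=True)[s - 1]


def committed_prefix(cc_votes: dict[str, int]) -> int:
    """Assumed CC rule: commit the shortest certified prefix among the CC-votes."""
    return min(cc_votes.values())


def parent_prefix(cc_votes: dict[str, int]) -> int:
    """Per the caption: the next proposal extends the maximum certified prefix."""
    return max(cc_votes.values())


def pick(votes: dict[str, int], names: tuple[str, ...]) -> dict[str, int]:
    """Restrict a vote map to the replicas that make up one quorum."""
    return {name: votes[name] for name in names}


# Hypothetical QC-votes: the longest prefix each replica has locally available.
QC_VOTES = {"R1": 2, "R2": 3, "R3": 5, "R4": 4}
# Hypothetical CC-votes: the certified prefix of the QC each replica holds.
CC_VOTES = {"R1": 3, "R2": 4, "R3": 4, "R4": 4}

if __name__ == "__main__":
    for quorum in [("R1", "R2", "R3"), ("R2", "R3", "R4")]:
        qc = certified_prefix(pick(QC_VOTES, quorum), s=2)
        cc = pick(CC_VOTES, quorum)
        print(quorum, qc, committed_prefix(cc), parent_prefix(cc))
```

With these hypothetical votes, the two quorums certify and commit prefixes 3 and 4 respectively, and both extend prefix 4, matching the caption. Other vote assignments would fit the caption equally well.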

Theorems & Definitions (30)

  • Definition 1: Byzantine Atomic Broadcast
  • Definition 2: Block Proposal
  • Definition 3: Latency Metrics
  • Definition 4: Non-Interactive Aggregate Signature
  • lemma 1: Supermajority Quorum Intersection
  • proof
  • lemma 2
  • proof
  • lemma 3
  • proof
  • ...and 20 more