OptiLog: Assigning Roles in Byzantine Consensus

Hanish Gogada, Christian Berger, Leander Jehl, Hans P. Reiser, Hein Meling

TL;DR

OptiLog addresses the scalability of Byzantine fault-tolerant (BFT) protocols in wide-area deployments by introducing a measurement-driven, append-only log that merges local replica measurements into a global view for consistent role assignment and accountability. It combines latency, misbehavior, and suspicion monitoring with a configurable search process (including simulated annealing) to adaptively select low-latency configurations while excluding faulty replicas from critical roles. The approach is instantiated in two protocols, Opti-Aware (PBFT-like) and Opti-Tree (tree-based), achieving up to 39% lower latency and up to 2.5x higher throughput in large wide-area experiments, and remaining resilient to timing attacks and misbehavior with manageable overhead. The work shows that principled, log-based measurement collection can sustain high performance under faults and adversarial delays, which is of practical value for large-scale distributed ledgers and consensus systems.
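To make the search step mentioned above more concrete, the following is a minimal sketch of simulated annealing over replica placements in a tree overlay. It is not the paper's implementation: the function names `tree_latency` and `anneal`, the cost function (worst root-to-leaf latency in a level-order b-ary tree), the swap move, and the cooling schedule are all assumptions chosen for illustration.

```python
import math
import random


def tree_latency(order, latency, b):
    """Worst root-to-leaf latency of a b-ary tree stored in level order.

    order[i] is the replica placed at tree position i; the parent of
    position i > 0 is position (i - 1) // b. This cost function is an
    illustrative assumption, not the scoring used by the paper.
    """
    path_cost = [0.0] * len(order)
    for i in range(1, len(order)):
        parent = (i - 1) // b
        path_cost[i] = path_cost[parent] + latency[order[parent]][order[i]]
    return max(path_cost)


def anneal(latency, b, steps=5000, t0=50.0, cooling=0.999, seed=0):
    """Search for a low-latency placement by swapping two positions per step."""
    rng = random.Random(seed)
    n = len(latency)
    current = list(range(n))
    current_cost = tree_latency(current, latency, b)
    best, best_cost = current[:], current_cost
    if n < 2:
        return best, best_cost  # nothing to swap
    temp = t0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        current[i], current[j] = current[j], current[i]
        cost = tree_latency(current, latency, b)
        delta = cost - current_cost
        # Always accept improvements; accept a worse placement with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = cost
            if cost < best_cost:
                best, best_cost = current[:], cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
        temp *= cooling
    return best, best_cost
```

Calling `anneal(latency, 3)` with a 13x13 latency matrix returns a placement for a tree of the size shown in Figure 5, together with its cost. A real deployment would additionally need to keep replicas flagged as faulty out of internal positions, which this sketch does not model.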

Abstract

Byzantine Fault-Tolerant (BFT) protocols play an important role in blockchains. As the deployment of such systems extends to wide-area networks, the scalability of BFT protocols becomes a critical concern. Optimizations that assign specific roles to individual replicas can significantly improve the performance of BFT systems. However, such role assignment is highly sensitive to faults, potentially undermining the optimizations' effectiveness. To address these challenges, we present OptiLog, a logging framework for collecting and analyzing measurements that help to assign roles in globally distributed systems, despite the presence of faults. OptiLog presents local measurements in global data structures, to enable consistent decisions and hold replicas accountable if they do not perform according to their reported measurements. We demonstrate OptiLog's flexibility by applying it to two BFT protocols: (1) Aware, a highly optimized PBFT-like protocol, and (2) Kauri, a tree-based protocol designed for large-scale deployments. OptiLog detects and excludes replicas that misbehave during consensus and thus enables the system to operate in an optimized, low-latency configuration, even under adverse conditions. Experiments show that for tree overlays deployed across 73 worldwide cities, trees found by OptiLog display 39% lower latency than Kauri.

Paper Structure

This paper contains 47 sections, 12 theorems, 3 equations, 15 figures, 2 tables.

Key Result

Lemma 1

There are always at least $\mathit{n}-\mathit{f}$ candidates available in $\mathcal{K}$.

Figures (15)

  • Figure 1: OptiLog's component architecture.
  • Figure 2: Replicas ($\mathit{A}$,$\mathit{B}$,$\mathit{C}$,$\mathit{D}$) with sensors and monitors.
  • Figure 3: Connections between Replica $\mathit{A}$'s sensors and monitors.
  • Figure 4: PBFT message pattern showing expected round duration $d_{rnd}$ and message delay $d_m$.
  • Figure 5: Tree with $\mathit{n}=13$ replicas and branch factor $\mathit{b}=3$.
  • ...and 10 more figures

Theorems & Definitions (13)

  • Definition 1
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Theorem 1
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Lemma 7
  • Lemma 8
  • ...and 3 more