Distributed Consensus Network: A Modularized Communication Framework and Reliability Probabilistic Analysis
Yuetai Li, Zhangchen Xu, Yiqi Wang, Zihan Zhou, Lei Zhang, Jon Crowcroft
TL;DR
This work introduces a modularized communication framework for analyzing the reliability and latency of distributed consensus protocols operating over unreliable wireless links. By representing consensus communication as combinations of basic components (A/B/C) and defining activation probabilities, it yields a probabilistic model with a main theorem for the consensus success probability $P_C$ and tractable approximations for the failure rate $P_F$, along with latency implications. It defines Reliability Gain and Tolerance Gain to quantify how joint failure rates and fault-tolerance levels influence performance, and demonstrates two latency-reduction strategies, network scaling and optimized power allocation, both validated on a RAFT prototype. The results offer theoretical guidance for designing low-failure, low-latency consensus systems in imperfect communication environments and establish a pathway for extending the framework to other protocols such as PBFT and HotStuff.
Abstract
In this paper, we propose a modularized framework for the communication processes of crash and Byzantine fault-tolerant consensus protocols. We abstract basic communication components and show that the communication processes of classic consensus protocols such as RAFT, single-decree Paxos, PBFT, and HotStuff can be represented as combinations of these components. Based on the proposed framework, we develop an approach to analyzing the consensus reliability of different protocols, in which link loss and node failure are modeled as probabilities. We propose two latency optimization methods and implement a RAFT system to verify both our theoretical analysis and the effectiveness of these methods. We also discuss how adjusting the protocol design can decrease the consensus failure rate. This paper provides theoretical guidance for designing future consensus systems that achieve a low consensus failure rate and low latency in the presence of communication loss.
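As a rough illustration of what treating link loss and node failure as probabilities can look like, the sketch below computes the chance that a leader collects a majority in one request/acknowledge round. It is a simplified toy model, not the theorem derived in this paper: the function name, the parameters `p_link_loss` and `p_node_fail`, and the assumptions that the leader itself never fails and that all links and followers fail independently with identical probabilities are all hypothetical choices made for this example.

```python
from math import comb


def majority_success_probability(n, p_link_loss, p_node_fail):
    """Probability that a leader gathers a majority in one round (toy model).

    Assumptions (illustrative only, not taken from the paper):
      * n nodes in total, one of which is a leader that never fails;
      * each follower fails independently with probability p_node_fail;
      * the request (downlink) and the acknowledgement (uplink) are each
        lost independently with probability p_link_loss.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    # A follower's ack arrives only if both transmissions succeed
    # and the follower itself is alive.
    p_ack = (1 - p_link_loss) ** 2 * (1 - p_node_fail)

    followers = n - 1
    # A majority of n is n // 2 + 1 votes; the leader supplies one itself.
    needed = n // 2

    # Binomial tail: at least `needed` of the followers acknowledge.
    return sum(
        comb(followers, k) * p_ack ** k * (1 - p_ack) ** (followers - k)
        for k in range(needed, followers + 1)
    )


if __name__ == "__main__":
    for nodes in (3, 5, 7):
        p_c = majority_success_probability(nodes, p_link_loss=0.1, p_node_fail=0.0)
        print(f"n={nodes}: success={p_c:.4f}, failure={1 - p_c:.4f}")
```

Under these assumptions, the failure rate is one minus the returned value, so the loop gives a quick sense of how the number of nodes changes the failure rate at a fixed link loss.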
