ReCraft: Self-Contained Split, Merge, and Membership Change of Raft Protocol

Kezhi Xiong, Soonwon Moon, Joshua Kang, Bryant Curto, Jieung Kim, Ji-Yong Shin

TL;DR

ReCraft tackles the challenge of reconfiguring consensus systems by introducing a self-contained Raft reconfiguration protocol that supports split, merge, and membership changes without external coordinators, thus eliminating a major single point of failure. It combines epoch-based configurations, targeted quorum management, and a pull-based catch-up mechanism to preserve safety while enabling concurrent reconfiguration. The approach is formalized with safety and liveness proofs and mechanized in Rocq, and is implemented and evaluated within etcd, showing negligible overhead and favorable performance during splits and merges compared to emulated alternatives. The work significantly advances scalable, fault-tolerant multi-cluster Raft by providing a practical, robust alternative to external cluster management for large-scale deployments.

Abstract

Designing reconfiguration schemes for consensus protocols is challenging because subtle corner cases during reconfiguration could invalidate the correctness of the protocol. Thus, most systems that embed consensus protocols conservatively implement the reconfiguration and refrain from developing an efficient scheme. Existing implementations often stop the entire system during reconfiguration and rely on a centralized coordinator, which can become a single point of failure. We present ReCraft, a novel reconfiguration protocol for Raft, which supports multi- and single-cluster-level reconfigurations. ReCraft does not rely on external coordinators and blocks minimally. ReCraft enables the sharding of Raft clusters with split and merge reconfigurations and adds a membership change scheme that improves Raft. We prove the safety and liveness of ReCraft and demonstrate its efficiency through implementations in etcd.

Paper Structure

This paper contains 31 sections, 3 theorems, 8 figures, 1 table.

Key Result

Theorem 1

If a node has applied a log entry at a given index to its state machine, no other node will ever apply a different log entry for the same index.
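The property in Theorem 1 can be read as a check over the entries each node has applied. The following is only a minimal sketch of such a check, not code from the paper or from etcd: the function name `find_safety_violation`, the node identifiers, and the representation of an applied log as an index-to-entry mapping are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional, Tuple

# Hypothetical representation: for each node, the entries it has applied,
# keyed by log index. Nodes may have applied different prefixes of the log.
AppliedLog = Dict[int, Hashable]


def find_safety_violation(
    applied: Dict[str, AppliedLog],
) -> Optional[Tuple[int, str, str]]:
    """Return (index, node_a, node_b) for the first conflicting index found,
    or None if no two nodes applied different entries at the same index."""
    first_seen: Dict[int, Tuple[str, Hashable]] = {}
    for node in sorted(applied):
        for index in sorted(applied[node]):
            entry = applied[node][index]
            if index not in first_seen:
                # First node observed applying an entry at this index.
                first_seen[index] = (node, entry)
            elif first_seen[index][1] != entry:
                # Another node applied a different entry at the same index.
                return (index, first_seen[index][0], node)
    return None


if __name__ == "__main__":
    consistent = {"n1": {1: "x=1", 2: "y=2"}, "n2": {1: "x=1"}}
    conflicting = {"n1": {1: "x=1", 2: "y=2"}, "n2": {1: "x=1", 2: "y=3"}}
    print(find_safety_violation(consistent))
    print(find_safety_violation(conflicting))
```

A node that has applied fewer entries than another does not violate the property; only two different entries at the same index do. A checker of this kind could be run over execution traces in tests, but it only observes violations after the fact and is no substitute for the paper's proofs.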

Figures (8)

  • Figure 1: Reconfiguring a 2-node cluster ($C_{old}$) to a 5-node cluster ($C_{new}$) using Raft and ReCraft reconfiguration schemes.
  • Figure 2: Pseudo code of functions for the split protocol. Functions in red include communications: appendEntry, requestVote and PullLog are 1-to-1 RPCs, notifyCommit is a multicast RPC, and respondPull sends an acknowledgment.
  • Figure 3: An example of a series of split and merge operations from a cluster-level viewpoint. Rounded rectangles show the clusters' node configurations, and each cluster's committed log state is shown on the left. $C_{old}$ splits into $C_{sub.1-3}$ (steps a-d), and $C_{sub.1}$ and $C_{sub.2}$ merge into $C'_{new}$ (steps d-h).
  • Figure 4: Pseudo code of functions for the two-phase commit of the merge protocol. Functions in red include communications: Prepare2PC and Commit2PC are 1-to-1 RPCs, appendEntryToConf is a multicast RPC, and respond sends an acknowledgment.
  • Figure 5: The number of additional votes ReCraft requires during intermediate steps of the membership change compared to the best cases and the worst cases of the JC.
  • ...and 3 more figures

Theorems & Definitions (15)

  • Theorem 1: State machine safety
  • Definition 1: Leader Append-Only
  • Definition 2: Election Safety
  • Definition 3: Log Matching
  • Definition 4: Leader Completeness
  • Definition 5: Consensuses and quorums
  • Definition 6: Cluster Well-Formedness
  • Definition 7: Log Consistency
  • Lemma 1: Leader Completeness
  • Proof 1
  • ...and 5 more