Table of Contents
Fetching ...

Epoch-based Optimistic Concurrency Control in Geo-replicated Databases

Yunhao Mao, Harunari Takata, Michail Bachras, Yuqiu Zhang, Shiquan Zhang, Gengrui Zhang, Hans-Arno Jacobsen

TL;DR

Minerva is introduced, a unified distributed concurrency control designed for highly scalable multi-leader replication that employs a novel epoch-based asynchronous replication protocol that decouples data propagation from the commitment process, enabling continuous transaction replication.

Abstract

Geo-distribution is essential for modern online applications to ensure service reliability and high availability. However, supporting high-performance serializable transactions in geo-replicated databases remains a significant challenge. This difficulty stems from the extensive over-coordination inherent in distributed atomic commitment, concurrency control, and fault-tolerance replication protocols under high network latency. To address these challenges, we introduce Minerva, a unified distributed concurrency control designed for highly scalable multi-leader replication. Minerva employs a novel epoch-based asynchronous replication protocol that decouples data propagation from the commitment process, enabling continuous transaction replication. Optimistic concurrency control is used to allow any replicas to execute transactions concurrently and commit without coordination. In stead of aborting transactions when conflicts are detected, Minerva uses deterministic re-execution to resolve conflicts, ensuring serializability without sacrificing performance. To further enhance concurrency, we construct a conflict graph and use a maximum weight independent set algorithm to select the optimal subset of transactions for commitment, minimizing the number of re-executed transactions. Our evaluation demonstrates that Minerva significantly outperforms state-of-the-art replicated databases, achieving over $3\times$ higher throughput in scalability experiments and $2.8\times$ higher throughput during a high network latency simulation with the TPC-C benchmark.

Epoch-based Optimistic Concurrency Control in Geo-replicated Databases

TL;DR

Minerva is introduced, a unified distributed concurrency control designed for highly scalable multi-leader replication that employs a novel epoch-based asynchronous replication protocol that decouples data propagation from the commitment process, enabling continuous transaction replication.

Abstract

Geo-distribution is essential for modern online applications to ensure service reliability and high availability. However, supporting high-performance serializable transactions in geo-replicated databases remains a significant challenge. This difficulty stems from the extensive over-coordination inherent in distributed atomic commitment, concurrency control, and fault-tolerance replication protocols under high network latency. To address these challenges, we introduce Minerva, a unified distributed concurrency control designed for highly scalable multi-leader replication. Minerva employs a novel epoch-based asynchronous replication protocol that decouples data propagation from the commitment process, enabling continuous transaction replication. Optimistic concurrency control is used to allow any replicas to execute transactions concurrently and commit without coordination. In stead of aborting transactions when conflicts are detected, Minerva uses deterministic re-execution to resolve conflicts, ensuring serializability without sacrificing performance. To further enhance concurrency, we construct a conflict graph and use a maximum weight independent set algorithm to select the optimal subset of transactions for commitment, minimizing the number of re-executed transactions. Our evaluation demonstrates that Minerva significantly outperforms state-of-the-art replicated databases, achieving over higher throughput in scalability experiments and higher throughput during a high network latency simulation with the TPC-C benchmark.
Paper Structure (39 sections, 4 theorems, 12 figures, 1 table, 5 algorithms)

This paper contains 39 sections, 4 theorems, 12 figures, 1 table, 5 algorithms.

Key Result

lemma 1

If a Proof of Availability (PoA) is received by a correct replica, then at least one correct replica has received and stored the corresponding batch.

Figures (12)

  • Figure 1: Single-leader vs multi-leader replication. In multi-leader, the layers for AC (Atomic Commit), CC (Concurrency Control), and replication among R (replicas) are no longer orthogonal to each other as in primary-backup, leading to redundant communication.
  • Figure 2: Architecture of Minerva, illustrating the transaction workflow across three replicas. 1 Clients submit transactions to any replica. 2 The local replica optimistically executes transactions. 3 Inputs, read-sets, and write-sets are encapsulated in TransactionRecords. 4 These records are batched and asynchronously propagated to peer replicas. 5 A periodic commitment process triggers conflict resolution and deterministic re-execution. 6 Committed results are persisted and returned to clients.
  • Figure 3: Transactions and Batches
  • Figure 4: Replica A's view of the logs: Brown means committed, blue means available and has PoA, green means broadcast but without PoA, and white means received but not available. Red dotted line is the coordinator's consistent cut.
  • Figure 5: Conflict Graph of 6 transaction chains, arrows mean read-write or write-write conflicts, and red box shows the conflicting transaction chains ($TxnC$) that are invalidated and to be re-executed.
  • ...and 7 more figures

Theorems & Definitions (5)

  • lemma 1
  • lemma 2
  • lemma 3
  • definition 1
  • theorem 1