Table of Contents
Fetching ...

Notes on Theory of Distributed Systems

James Aspnes

TL;DR

Aspnes' notes synthesize foundational distributed computing theory through a comprehensive survey of models, impossibility results, and classic algorithms. The text surveys message-passing and shared-memory paradigms, covering synchronization, consensus, failure detectors, quorums, and permissionless systems, and culminates with advanced constructs such as Paxos, atomic snapshots, and the wait-free hierarchy. A central thread is balancing correctness (safety/consistency) with progress (liveness) under failures and asynchrony, using tools like causal ordering, logical clocks, synchronizers, and failure detectors to navigate FLP-type limits. The practical payoff is a deep understanding of how distributed protocols achieve reliable coordination, replication, and consensus across diverse architectures, including open, permissionless networks and blockchain-oriented systems, while rigorously detailing limits and trade-offs. Practical impact includes guidance for designing robust distributed protocols (e.g., Paxos variants, failure detectors, and quorum systems) that tolerate crashes, Byzantine faults, or Sybil attacks, and for reasoning about the fundamental costs of synchronization and coordination in large-scale systems.

Abstract

Notes for the Yale course CPSC 465/565 Theory of Distributed Systems. Table of Contents: 1 Introduction, 2 Model, 3 Broadcast and convergecast, 4 Distributed breadth-first search, 5 Leader election, 6 Causal ordering and logical clocks, 7 Synchronizers, 8 Coordinated attack, 9 Synchronous agreement, 10 Byzantine agreement, 11 Impossibility of asynchronous agreement, 12 Paxos, 13 Failure detectors, 14 Quorum systems, 15 Permissionless systems, 16 Model, 17 Distributed shared memory, 18 Mutual exclusion, 19 The wait-free hierarchy, 20 Atomic snapshots, 21 Lower bounds on perturbable objects, 22 Restricted-use objects, 23 Common2, 24 Randomized consensus and test-and-set, 25 Renaming, 26 Software transactional memory, 27 Obstruction-freedom, 28 BG simulation, 29 Topological methods, 30 Approximate agreement, 31 Overview, 32 Self-stabilization, 33 Distributed graph algorithms, 34 Population protocols, 35 Mobile robots, 36 Beeping

Notes on Theory of Distributed Systems

TL;DR

Aspnes' notes synthesize foundational distributed computing theory through a comprehensive survey of models, impossibility results, and classic algorithms. The text surveys message-passing and shared-memory paradigms, covering synchronization, consensus, failure detectors, quorums, and permissionless systems, and culminates with advanced constructs such as Paxos, atomic snapshots, and the wait-free hierarchy. A central thread is balancing correctness (safety/consistency) with progress (liveness) under failures and asynchrony, using tools like causal ordering, logical clocks, synchronizers, and failure detectors to navigate FLP-type limits. The practical payoff is a deep understanding of how distributed protocols achieve reliable coordination, replication, and consensus across diverse architectures, including open, permissionless networks and blockchain-oriented systems, while rigorously detailing limits and trade-offs. Practical impact includes guidance for designing robust distributed protocols (e.g., Paxos variants, failure detectors, and quorum systems) that tolerate crashes, Byzantine faults, or Sybil attacks, and for reasoning about the fundamental costs of synchronization and coordination in large-scale systems.

Abstract

Notes for the Yale course CPSC 465/565 Theory of Distributed Systems. Table of Contents: 1 Introduction, 2 Model, 3 Broadcast and convergecast, 4 Distributed breadth-first search, 5 Leader election, 6 Causal ordering and logical clocks, 7 Synchronizers, 8 Coordinated attack, 9 Synchronous agreement, 10 Byzantine agreement, 11 Impossibility of asynchronous agreement, 12 Paxos, 13 Failure detectors, 14 Quorum systems, 15 Permissionless systems, 16 Model, 17 Distributed shared memory, 18 Mutual exclusion, 19 The wait-free hierarchy, 20 Atomic snapshots, 21 Lower bounds on perturbable objects, 22 Restricted-use objects, 23 Common2, 24 Randomized consensus and test-and-set, 25 Renaming, 26 Software transactional memory, 27 Obstruction-freedom, 28 BG simulation, 29 Topological methods, 30 Approximate agreement, 31 Overview, 32 Self-stabilization, 33 Distributed graph algorithms, 34 Population protocols, 35 Mobile robots, 36 Beeping

Paper Structure

This paper contains 525 sections, 49 theorems, 35 equations, 19 figures, 1 table, 106 algorithms.

Key Result

Theorem 4.1.1

Every process receives $M$ after at most $D$ time and at most $\lvert*\rvert{E}$ messages, where $D$ is the diameter of the network and $E$ is the set of (directed) edges in the network.

Figures (19)

  • Figure 1: Asynchronous message-passing execution. Time flows left-to-right. Horizontal lines represent processes. Nodes represent events. Diagonal edges between events represent messages. In this execution, $p_1$ executes a computation event that sends messages to $p_2$ and $p_3$. When $p_2$ receives this message, it sends messages to $p_1$ and $p_3$. Later, $p_2$ executes a computation event that sends a second message to $p_1$. Because the system is asynchronous, there is no guarantee that messages arrive in the same order they are sent.
  • Figure 2: Asynchronous message-passing execution with FIFO channels. Multiple messages from one process to another are now guaranteed to be delivered in the order they are sent.
  • Figure 3: Synchronous message-passing execution. All messages are now delivered in exactly one time unit, and computation events immediately follow the delivery events.
  • Figure 4: Asynchronous message-passing execution with times.
  • Figure 5: Labels in the bit-reversal ring with $n=32$
  • ...and 14 more figures

Theorems & Definitions (94)

  • Theorem 4.1.1
  • proof
  • Lemma 4.1.2
  • proof
  • Lemma 5.1.1
  • proof
  • Lemma 6.1.1
  • proof
  • Lemma 7.1.1
  • proof
  • ...and 84 more