Table of Contents
Fetching ...

Multi-Grained Specifications for Distributed System Model Checking and Verification

Lingzhi Ouyang, Xudong Sun, Ruize Tang, Yu Huang, Madhav Jivrajani, Xiaoxing Ma, Tianyin Xu

TL;DR

This paper uses TLA+ to model finegrained behaviors of ZooKeeper and uses the TLC model checker to verify its correctness properties; it shows that writing multi-grained specifications is a viable practice and can cope with model-code gaps without untenable state space, especially for evolving software where changes are typically local and incremental.

Abstract

This paper presents our experience specifying and verifying the correctness of ZooKeeper, a complex and evolving distributed coordination system. We use TLA+ to model fine-grained behaviors of ZooKeeper and use the TLC model checker to verify its correctness properties; we also check conformance between the model and code. The fundamental challenge is to balance the granularity of specifications and the scalability of model checking -- fine-grained specifications lead to state-space explosion, while coarse-grained specifications introduce model-code gaps. To address this challenge, we write specifications with different granularities for composable modules, and compose them into mixed-grained specifications based on specific scenarios. For example, to verify code changes, we compose fine-grained specifications of changed modules and coarse-grained specifications that abstract away details of unchanged code with preserved interactions. We show that writing multi-grained specifications is a viable practice and can cope with model-code gaps without untenable state space, especially for evolving software where changes are typically local and incremental. We detected six severe bugs that violate five types of invariants and verified their code fixes; the fixes have been merged to ZooKeeper. We also improve the protocol design to make it easy to implement correctly.

Multi-Grained Specifications for Distributed System Model Checking and Verification

TL;DR

This paper uses TLA+ to model finegrained behaviors of ZooKeeper and uses the TLC model checker to verify its correctness properties; it shows that writing multi-grained specifications is a viable practice and can cope with model-code gaps without untenable state space, especially for evolving software where changes are typically local and incremental.

Abstract

This paper presents our experience specifying and verifying the correctness of ZooKeeper, a complex and evolving distributed coordination system. We use TLA+ to model fine-grained behaviors of ZooKeeper and use the TLC model checker to verify its correctness properties; we also check conformance between the model and code. The fundamental challenge is to balance the granularity of specifications and the scalability of model checking -- fine-grained specifications lead to state-space explosion, while coarse-grained specifications introduce model-code gaps. To address this challenge, we write specifications with different granularities for composable modules, and compose them into mixed-grained specifications based on specific scenarios. For example, to verify code changes, we compose fine-grained specifications of changed modules and coarse-grained specifications that abstract away details of unchanged code with preserved interactions. We show that writing multi-grained specifications is a viable practice and can cope with model-code gaps without untenable state space, especially for evolving software where changes are typically local and incremental. We detected six severe bugs that violate five types of invariants and verified their code fixes; the fixes have been merged to ZooKeeper. We also improve the protocol design to make it easy to implement correctly.
Paper Structure (38 sections, 2 theorems, 8 figures, 6 tables)

This paper contains 38 sections, 2 theorems, 8 figures, 6 tables.

Key Result

Theorem 1

Given $S = \bigcup_{1 \le i \le n}M_{i}$ and $S_i = (\bigcup_{j \neq i}\widetilde{M_j}) \cup M_i$, we have $T_S \stackrel{M_i}{\sim} T_{S_i}$.

Figures (8)

  • Figure 1: Pen-and-paper description and TLA$^+$ specification of Step $f.2.1$ in Phase 2 of the Zab protocol.
  • Figure 2: Code implementation in Java and the corresponding system specification in TLA$^+$ of Step $f.2.1$ in Phase 2 of the Zab protocol in ZooKeeper.
  • Figure 3: Fine-grained modeling that splits the atomic action, FollowerProcessNEWLEADER in Figure \ref{['subfig:zab1.0-spec']}, into three actions (Actions 1--3).
  • Figure 4: Fine-grained modeling on concurrency with async logging.queuedRequests is used in Figure \ref{['subfig:refine-part2']}.
  • Figure 5: Interaction-preserving coarsening of the eight actions in the Election and Discovery phases.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Theorem 1: Interaction Preservation
  • Theorem 2: Interaction Preservation
  • Definition 1: module
  • definition 1: dependency variable
  • definition 2: interaction variable