VeriFx: Correct Replicated Data Types for the Masses
Kevin De Porre, Carla Ferreira, Elisa Gonzalez Boix
TL;DR
VeriFx addresses the difficulty of building correct replicated data types (RDTs) by providing a high-level functional language with automated verification for Conflict-free Replicated Data Types (CRDTs) and Operational Transformation (OT) functions. It combines an architecture that translates VeriFx programs to a decidable SMT theory with libraries for CRDTs and OT, enabling end-to-end verification of real-world designs and automatic transpilation to Scala or JavaScript. The paper demonstrates verification of 35 CRDTs and multiple OT functions within seconds, with counterexamples generated automatically when a property fails. Because developers implement and verify RDTs in the same language, the approach lowers the barrier to deploying them safely in distributed systems with weak consistency guarantees.
Abstract
Distributed systems adopt weak consistency to ensure high availability and low latency, but state convergence is hard to guarantee due to conflicts. Experts carefully design replicated data types (RDTs) that resemble sequential data types and embed conflict resolution mechanisms that ensure convergence. Designing RDTs is challenging as their correctness depends on subtleties such as the ordering of concurrent operations. Currently, researchers manually verify RDTs, either by paper proofs or using proof assistants. Unfortunately, paper proofs are subject to reasoning flaws and mechanized proofs verify a formalisation instead of a real-world implementation. Furthermore, writing mechanized proofs is reserved for verification experts and is extremely time-consuming. To simplify the design, implementation, and verification of RDTs, we propose VeriFx, a high-level programming language with automated proof capabilities. VeriFx lets programmers implement RDTs atop functional collections and express correctness properties that are verified automatically. Verified RDTs can be transpiled to mainstream languages (currently Scala or JavaScript). VeriFx also provides libraries for implementing and verifying Conflict-free Replicated Data Types (CRDTs) and Operational Transformation (OT) functions. These libraries implement the general execution model of those approaches and define their correctness properties. We use the libraries to implement and verify an extensive portfolio of 35 CRDTs and reproduce a study on the correctness of OT functions.
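To make the notion of a "correctness property" concrete, the following is a minimal TypeScript sketch of a well-known state-based CRDT, the grow-only counter, together with the three standard convergence conditions on its merge function (idempotence, commutativity, associativity). It is not VeriFx code and does not use any VeriFx API: all names (`GCounter`, `increment`, `merge`, `holdsOnSamples`) are hypothetical and chosen for illustration. It also only checks the properties on a handful of sample states, whereas VeriFx is described as verifying such properties automatically rather than by testing.

```typescript
// Grow-only counter: one non-negative count per replica id.
// All names here are illustrative assumptions, not part of VeriFx.
type GCounter = Readonly<Record<string, number>>;

// A missing entry is treated as a count of zero.
function get(c: GCounter, replica: string): number {
  return c[replica] ?? 0;
}

// Local update: a replica only ever bumps its own entry.
function increment(c: GCounter, replica: string): GCounter {
  return { ...c, [replica]: get(c, replica) + 1 };
}

// Merge takes the entry-wise maximum of the two states.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: Record<string, number> = { ...a };
  for (const replica of Object.keys(b)) {
    out[replica] = Math.max(get(out, replica), get(b, replica));
  }
  return out;
}

// The counter's value is the sum of all per-replica counts.
function value(c: GCounter): number {
  return Object.keys(c).reduce((sum, r) => sum + get(c, r), 0);
}

// Two states are equal when they agree on every replica id.
function equal(a: GCounter, b: GCounter): boolean {
  return Object.keys({ ...a, ...b }).every((k) => get(a, k) === get(b, k));
}

// A few hand-picked states; a sample, not an exhaustive enumeration.
const samples: GCounter[] = [
  {},
  increment({}, "A"),
  increment(increment({}, "A"), "B"),
  increment(increment({}, "B"), "B"),
];

// Checks idempotence, commutativity and associativity of merge on the samples.
function holdsOnSamples(): boolean {
  for (const a of samples) {
    if (!equal(merge(a, a), a)) return false;
    for (const b of samples) {
      if (!equal(merge(a, b), merge(b, a))) return false;
      for (const c of samples) {
        if (!equal(merge(merge(a, b), c), merge(a, merge(b, c)))) return false;
      }
    }
  }
  return true;
}

// Two replicas update concurrently, then exchange and merge their states.
const left = increment(increment({}, "A"), "A");
const right = increment({}, "B");
console.log(value(merge(left, right)), value(merge(right, left))); // 3 3
console.log(holdsOnSamples()); // true
```

Because merge is an entry-wise maximum, both replicas reach the same state regardless of the order in which they receive each other's updates. A sample-based check like this can only find counterexamples among the states it tries, which is the gap that automated verification over all states is meant to close.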
