Table of Contents
Fetching ...

Consistent Updates for Scalable Microservices

Devora Chait-Roth, Kedar S. Namjoshi, Thomas Wies

TL;DR

The paper tackles the challenge of updating live scalable microservices without exposing clients to mixed-version inconsistencies. It develops a formal framework for update consistency, modeling services as labeled transition systems and proving that updates must look atomic to each client; it then derives provably correct mixed-mode update algorithms based on semantic properties like commutativity and compatibility. A central contribution is showing an impossibility result: semantically oblivious updates cannot guarantee consistency, thus runtime semantic analysis is necessary. The work provides a foundational, in-place approach for consistent rolling updates and lays out induction-based proof techniques, enabling future extension to networks of microservices. The results have practical impact by enabling efficient, provably consistent updates without resorting to replication or throughput-hampering shutdowns.

Abstract

Online services are commonly implemented with a scalable microservice architecture, where isomorphic workers process client requests, recording persistent state in a backend data store. To maintain service, modifications to service functionality must be made on the fly -- i.e., as the service continues to process client requests -- but doing so is challenging. The central difficulty is that of avoiding inconsistencies from mixed-mode operation, caused by workers of current and new versions interacting via the data store. Some update methods avoid mixed-mode altogether, but only at the cost of substantial inefficiency -- by doubling resources (memory and compute), or by halving throughput. The alternative is an uncontrolled ``rolling'' update, which runs the risk of serious service failures arising from inconsistent mixed-mode behavior. Ideally, it should appear to every client that a service update takes effect atomically; this ensures that a client is not exposed to inconsistent mixed-mode behavior. In this paper, we introduce a framework that formalizes this intuition and develop foundational theory for reasoning about update consistency. We apply this theory to derive the first algorithms that guarantee consistency for mixed-mode updates. The algorithms rely on semantic properties of service actions, such as commutativity. We show that this is unavoidable, by proving that any semantically oblivious mixed-mode update method must allow inconsistencies.

Consistent Updates for Scalable Microservices

TL;DR

The paper tackles the challenge of updating live scalable microservices without exposing clients to mixed-version inconsistencies. It develops a formal framework for update consistency, modeling services as labeled transition systems and proving that updates must look atomic to each client; it then derives provably correct mixed-mode update algorithms based on semantic properties like commutativity and compatibility. A central contribution is showing an impossibility result: semantically oblivious updates cannot guarantee consistency, thus runtime semantic analysis is necessary. The work provides a foundational, in-place approach for consistent rolling updates and lays out induction-based proof techniques, enabling future extension to networks of microservices. The results have practical impact by enabling efficient, provably consistent updates without resorting to replication or throughput-hampering shutdowns.

Abstract

Online services are commonly implemented with a scalable microservice architecture, where isomorphic workers process client requests, recording persistent state in a backend data store. To maintain service, modifications to service functionality must be made on the fly -- i.e., as the service continues to process client requests -- but doing so is challenging. The central difficulty is that of avoiding inconsistencies from mixed-mode operation, caused by workers of current and new versions interacting via the data store. Some update methods avoid mixed-mode altogether, but only at the cost of substantial inefficiency -- by doubling resources (memory and compute), or by halving throughput. The alternative is an uncontrolled ``rolling'' update, which runs the risk of serious service failures arising from inconsistent mixed-mode behavior. Ideally, it should appear to every client that a service update takes effect atomically; this ensures that a client is not exposed to inconsistent mixed-mode behavior. In this paper, we introduce a framework that formalizes this intuition and develop foundational theory for reasoning about update consistency. We apply this theory to derive the first algorithms that guarantee consistency for mixed-mode updates. The algorithms rely on semantic properties of service actions, such as commutativity. We show that this is unavoidable, by proving that any semantically oblivious mixed-mode update method must allow inconsistencies.

Paper Structure

This paper contains 37 sections, 16 theorems, 3 equations, 1 figure, 6 algorithms.

Key Result

theorem 1

For $|M| \geq 2$, there does not exist an update algorithm that:

Figures (1)

  • Figure 1: Space-time diagram showing a global update cut (\ref{['sec:progress-measures']}), the downwards closure of clients' non-updated receives. Each process's timeline orders actions from left to right. Alice, Bob, and Claire are clients, $w_{a,b,c}$ are workers, and $DB$ is the database. The dotted curve connecting the cut points gives the full cut. The cut is update-consistent: all non-updated actions lie within the cut, while all updated actions lie outside the cut.

Theorems & Definitions (21)

  • definition 1
  • theorem 1: Impossibility result
  • lemma 1
  • lemma 2
  • theorem 2
  • theorem 3
  • lemma 3
  • definition 2
  • theorem 4
  • definition 3
  • ...and 11 more