Consistent Updates for Scalable Microservices
Devora Chait-Roth, Kedar S. Namjoshi, Thomas Wies
TL;DR
The paper tackles the challenge of updating live scalable microservices without exposing clients to mixed-version inconsistencies. It develops a formal framework for update consistency, modeling services as labeled transition systems and proving that updates must look atomic to each client; it then derives provably correct mixed-mode update algorithms based on semantic properties like commutativity and compatibility. A central contribution is showing an impossibility result: semantically oblivious updates cannot guarantee consistency, thus runtime semantic analysis is necessary. The work provides a foundational, in-place approach for consistent rolling updates and lays out induction-based proof techniques, enabling future extension to networks of microservices. The results have practical impact by enabling efficient, provably consistent updates without resorting to replication or throughput-hampering shutdowns.
Abstract
Online services are commonly implemented with a scalable microservice architecture, where isomorphic workers process client requests, recording persistent state in a backend data store. To maintain service, modifications to service functionality must be made on the fly -- i.e., as the service continues to process client requests -- but doing so is challenging. The central difficulty is that of avoiding inconsistencies from mixed-mode operation, caused by workers of current and new versions interacting via the data store. Some update methods avoid mixed-mode altogether, but only at the cost of substantial inefficiency -- by doubling resources (memory and compute), or by halving throughput. The alternative is an uncontrolled ``rolling'' update, which runs the risk of serious service failures arising from inconsistent mixed-mode behavior. Ideally, it should appear to every client that a service update takes effect atomically; this ensures that a client is not exposed to inconsistent mixed-mode behavior. In this paper, we introduce a framework that formalizes this intuition and develop foundational theory for reasoning about update consistency. We apply this theory to derive the first algorithms that guarantee consistency for mixed-mode updates. The algorithms rely on semantic properties of service actions, such as commutativity. We show that this is unavoidable, by proving that any semantically oblivious mixed-mode update method must allow inconsistencies.
