The Case for DBMS Live Patching [Extended Version]

Michael Fruth, Stefanie Scherzinger

TL;DR

Live patching can be a viable option for updating database management systems, because database providers can make informed decisions about the latency overhead on the client side.

Abstract

Traditionally, when the code of a database management system (DBMS) needs to be updated, the system is restarted and database clients suffer downtime, or the provider instantiates hot-standby instances and rolls over the workload. We investigate a third option: live patching of the DBMS binary. For certain code changes, live patching allows the application code to be modified in memory without a restart, so that the memory state and all client connections are maintained. Although live patching has been explored in the operating systems research community, it remains a blind spot in DBMS research. In this Experiment, Analysis & Benchmark article, we systematically explore this field from the DBMS perspective. We discuss what distinguishes database management systems from generic multi-threaded applications when it comes to live patching. We then propose domain-specific strategies for injecting quiescence points into the DBMS source code, so that threads can safely migrate to the patched process version. We experimentally investigate the interplay between the query workload and different quiescence methods, monitoring both transaction throughput and tail latencies. We show that live patching can be a viable option for updating database management systems, because database providers can make informed decisions about the latency overhead on the client side.

Paper Structure

This paper contains 32 sections, 16 figures, 2 tables.

Figures (16)

  • Figure 1: Seamlessly live patching MariaDB through four code versions. Colors distinguish the code versions.
  • Figure 2: Live patching a DBMS: global quiescence (top) vs. local quiescence (bottom). Graphic based on Figure 2 of DBLP:conf/osdi/RommelDFKBMSL20.
  • Figure 3: Process memory layout after WfPatch operations.
  • Figure 4: Implementing one-thread-per-connection (top left) and thread pool policy (right block), inspired by MariaDB.
  • Figure 5: Query throughput over time for OLTP workloads, comparing MariaDB without patch application ("baseline", blue) against patching in 5-second intervals with different setups. A chart displaying all five patch IDs and bar charts showing the query throughput are available in the appendix.
  • ...and 11 more figures