On Scalable Integrity Checking for Secure Cloud Disks

Quinn Burke, Ryan Sheatsley, Rachel King, Owen Hines, Michael Swift, Patrick McDaniel

TL;DR

Merkle hash trees protect data integrity but incur significant CPU and I/O costs on cloud block storage as capacity grows. The authors recast optimal hash-tree design as an optimal prefix code (Huffman) problem and propose Dynamic Merkle Trees (DMTs), which adapt online to skewed workloads via randomized splaying and per-node hotness tracking. In cloud-scale experiments across workloads, traces, and OLTP scenarios, DMTs achieve up to 2.2x higher throughput and outperform traditional balanced and high-degree trees. The work demonstrates a practical, scalable integrity mechanism that exploits workload patterns, and it is released as open source for deployment in real cloud environments.

Abstract

Merkle hash trees are the standard method to protect the integrity and freshness of stored data. However, hash trees introduce additional compute and I/O costs on the I/O critical path, and prior efforts have not fully characterized these costs. In this paper, we quantify performance overheads of storage-level hash trees in realistic settings. We then design an optimized tree structure called Dynamic Merkle Trees (DMTs) based on an analysis of root causes of overheads. DMTs exploit patterns in workloads to deliver up to a 2.2x throughput and latency improvement over the state of the art. Our novel approach provides a promising new direction to achieve integrity guarantees in storage efficiently and at scale.

Paper Structure (17 sections, 1 theorem, 2 equations, 18 figures, 3 tables)

Key Result

Theorem 1

A hash tree constructed as an optimal prefix code is optimal for an i.i.d. access probability distribution.
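The intuition behind Theorem 1 can be illustrated with a small sketch. It assumes a binary tree, a Zipf(2.5) access distribution over eight blocks (the skew used in Figure 3), and takes leaf depth as a proxy for the number of hashes recomputed per access. The function names (`huffman_depths`, `expected_depth`, `zipf_probs`) are hypothetical and are not taken from the paper's implementation.

```python
import heapq
import math
from itertools import count


def huffman_depths(probs):
    """Return the leaf depth of each block in a binary Huffman tree."""
    if len(probs) == 1:
        return [0]
    tiebreak = count()  # keeps heap entries comparable when masses are equal
    # Each heap entry: (probability mass, tiebreak, leaf indices in this subtree).
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, leaves1 = heapq.heappop(heap)
        p2, _, leaves2 = heapq.heappop(heap)
        merged = leaves1 + leaves2
        # Merging two subtrees pushes every leaf under them one level deeper.
        for i in merged:
            depths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return depths


def expected_depth(probs, depths):
    """Expected leaf-to-root path length under i.i.d. accesses."""
    return sum(p * d for p, d in zip(probs, depths))


def zipf_probs(n, s):
    """Normalized Zipf probabilities for ranks 1..n with exponent s."""
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


probs = zipf_probs(8, 2.5)
huff = huffman_depths(probs)
balanced = [math.ceil(math.log2(len(probs)))] * len(probs)

print("Huffman expected depth: ", expected_depth(probs, huff))
print("Balanced expected depth:", expected_depth(probs, balanced))
```

Under a skewed distribution the hottest block sits close to the root, so the expected path length falls below the fixed depth of a balanced tree. With a uniform distribution the two trees coincide.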

Figures (18)

  • Figure 1: We assume that VM memory contents are trusted and cloud storage devices are untrusted; VM memory can be protected with trusted execution primitives [aws_sev_snp].
  • Figure 2: A Merkle hash tree protects the integrity and freshness of data read from/written to a storage device.
  • Figure 3: This graph shows how throughput decreases w.r.t. capacity under an exemplar setup and workload. Experiment parameters: Workload: Zipf(2.5), Read ratio: 1%, I/O size: 32 KB, Cache size: 10%.
  • Figure 4: CPU vs. I/O time during the driver write routine. Same experiment parameters as above.
  • Figure 5: This graph shows the latency of computing SHA256 hashes on a modern processor with hardware acceleration for cryptographic functions. The annotations highlight the input data size to the hash function at different tree arities.
  • ...and 13 more figures
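The mechanism described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the paper's driver: it assumes a binary tree over a power-of-two number of blocks, SHA-256 as the hash (as in Figure 5), and a root kept in trusted memory. The helper names (`build_levels`, `auth_path`, `verify`) are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return all tree levels, leaves first. Assumes len(blocks) is a power of two."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes the concatenation of its two children (64 bytes in).
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Collect the sibling hashes from leaf `index` up to (not including) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling shares the same parent
        index //= 2
    return path


def verify(block, index, path, trusted_root):
    """Recompute the root from an untrusted block and compare with the trusted root."""
    node = h(block)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == trusted_root


blocks = [bytes([i]) * 4096 for i in range(4)]
levels = build_levels(blocks)
root = levels[-1][0]  # held in trusted VM memory

ok = verify(blocks[2], 2, auth_path(levels, 2), root)
tampered = verify(b"\xff" * 4096, 2, auth_path(levels, 2), root)

# Freshness: after a legitimate write the root changes, so replaying the
# old block together with its old path no longer verifies.
new_blocks = list(blocks)
new_blocks[2] = b"\x07" * 4096
new_root = build_levels(new_blocks)[-1][0]
replayed = verify(blocks[2], 2, auth_path(levels, 2), new_root)

print(ok, tampered, replayed)
```

Each verification hashes one node per tree level, which is why tree depth and arity matter for the costs the paper measures. A higher arity shortens the path but enlarges the input to every hash call.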

Theorems & Definitions (2)

  • Theorem 1
  • proof