Efficient Storage Integrity in Adversarial Settings

Quinn Burke, Ryan Sheatsley, Yohan Beugin, Eric Pauley, Owen Hines, Michael Swift, Patrick McDaniel

TL;DR

This work addresses the challenge of maintaining data integrity, freshness, and transactional consistency on untrusted storage without incurring prohibitive overhead. It introduces Partially Asynchronous Integrity Checking (PAC), a hybrid approach that checks reads synchronously while handling writes asynchronously via a secure in-memory queue and coordinated flush-based sealing, delivering strong integrity guarantees with low overhead. The authors formalize read and write guarantees, implement PAC as a Linux block-device driver, and demonstrate substantial performance benefits over prior hash-tree designs (up to 5.5× throughput) while reaching roughly 85% of the throughput of encryption-only (AEAD) baselines. The results show that PAC scales to large storage sizes, maintains low memory overhead, and enables practical, integrity-critical workloads on untrusted cloud storage. Open-source code is available for deployment.

Abstract

Storage integrity is essential to systems and applications that use untrusted storage (e.g., public clouds, end-user devices). However, known methods for achieving storage integrity either suffer from high (and often prohibitive) overheads or provide weak integrity guarantees. In this work, we demonstrate a hybrid approach to storage integrity that reduces overhead while still providing strong integrity guarantees. Our system, partially asynchronous integrity checking (PAC), allows disk write commitments to be deferred while still providing guarantees around read integrity. PAC delivers a 5.5× improvement in throughput and latency over the state of the art, and 85% of the throughput achieved by non-integrity-assuring approaches. In this way, we show that untrusted storage can be used for integrity-critical workloads without meaningfully sacrificing performance.

Paper Structure

This paper contains 19 sections, 2 theorems, 17 figures, 2 tables.

Key Result

Theorem 1

In the absence of system crashes, PAC always returns the most recent version of data to the caller on reads or the integrity check fails. (Read guarantee)
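
To make the read guarantee concrete, the following is a minimal toy model in Python, not the authors' implementation. It assumes trusted memory holds a per-block digest plus a queue of deferred writes, while the disk is an untrusted dictionary. The class `ToyStore`, its methods, and the use of SHA-256 digests in place of a hash tree are all assumptions made for illustration.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when data read from the untrusted disk fails verification."""


def digest(block: int, data: bytes) -> bytes:
    # Bind the digest to the block number so data cannot be relocated.
    return hashlib.sha256(block.to_bytes(8, "big") + data).digest()


class ToyStore:
    """Hypothetical model: deferred writes, synchronously checked reads."""

    def __init__(self):
        self.disk = {}     # untrusted: block number -> bytes
        self.digests = {}  # trusted memory: digest of the latest flushed version
        self.pending = {}  # trusted memory: latest write not yet flushed

    def write(self, block: int, data: bytes) -> None:
        # The write is only queued; committing it to disk is deferred.
        self.pending[block] = data

    def flush(self) -> None:
        # Commit queued writes and record their digests in trusted memory.
        for block, data in self.pending.items():
            self.disk[block] = data
            self.digests[block] = digest(block, data)
        self.pending.clear()

    def read(self, block: int) -> bytes:
        # A queued write is always the most recent version.
        if block in self.pending:
            return self.pending[block]
        data = self.disk.get(block, b"")
        if self.digests.get(block) != digest(block, data):
            raise IntegrityError(f"block {block} failed its integrity check")
        return data
```

In this sketch a read either returns the latest write (from the queue or from a verified disk block) or raises `IntegrityError`, for example when an adversary rolls the disk back to an older version of a block.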

Figures (17)

  • Figure 1: We consider an IaaS deployment model where an application runs inside a guest VM and stores data on a fast, local NVMe disk.
  • Figure 2: A Merkle hash tree protects the integrity of data read from/written to a storage device.
  • Figure 3: We assume that VM memory contents are trusted and cloud disks are untrusted; VM memory can be protected with trusted execution primitives [aws_sev_snp].
  • Figure 4: Aggregate read/write throughput vs. capacity and I/O size. Experiment parameters: Workload: Zipf(2.5), Read ratio: 1%, Cache size: 10%, Capacity: 1 TB, I/O size: 32 KB, Threads: 1, I/O depth: 32.
  • Figure 5: Batched processing (Note: chk=checkpoint step).
  • ...and 12 more figures
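
As background for the Merkle hash tree named in Figure 2, here is a minimal Python sketch of building a tree over data blocks and verifying one block against a trusted root. It is a generic illustration, not the paper's design: the function names, the binary tree shape, SHA-256, and the duplicate-last-node padding rule are all assumptions, and at least one block is assumed.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return all levels of the tree, leaves first; the last level is [root]."""
    level = [h(b"\x00" + b) for b in blocks]  # 0x00 prefix marks leaf hashes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels (a simplification)
        level = [h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks inner nodes
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sib = index ^ 1
        # On an odd-length level the last node is paired with itself.
        sibling = level[sib] if sib < len(level) else level[index]
        proof.append((sibling, index % 2 == 0))  # True: sibling is on the right
        index //= 2
    return proof


def verify(root, block, proof):
    """Recompute the root from a block and its proof; compare to the trusted root."""
    node = h(b"\x00" + block)
    for sibling, sibling_on_right in proof:
        if sibling_on_right:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
    return node == root


blocks = [b"block-0", b"block-1", b"block-2"]
levels = build_tree(blocks)
root = levels[-1][0]  # only this value needs to live in trusted memory
```

Only the root must be kept in trusted memory. Any block read from the untrusted disk can then be checked with `verify(root, block, prove(levels, i))`, at a cost that grows with the tree height.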

Theorems & Definitions (4)

  • Theorem 1
  • proof
  • Theorem 2
  • proof