Buffered Partially-Persistent External-Memory Search Trees

Gerth Stølting Brodal, Casper Moldrup Rysgaard, Rolf Svenning

TL;DR

The paper addresses maintaining a buffered partially-persistent external-memory search tree that supports updates to the current version and queries on any past version. It combines buffering of updates with a geometric view of persistence to match the external-memory performance of the ephemeral $B^{\varepsilon}$-tree in a persistent setting. The main results are amortized I/O bounds of $O\left( \frac{1}{\varepsilon B^{1-\varepsilon}} \log_{B} N_v \right)$ for insertions and deletions and $O\left( \frac{1}{\varepsilon} \log_{B} N_v + K/B \right)$ for successor and range queries, with space linear in the total number of updates; the same bounds hold in the worst case when $M = \Omega\left( B^{1-\varepsilon} \log_2(\max_v N_v) \right)$. Compared to prior work, this is a structure that combines buffering with partial persistence, and its worst-case variant matches the worst-case I/O bounds of the ephemeral external-memory dictionary by Das, Iacono, and Nekrich while slightly lowering the memory requirement from $M = \Omega\left( B \log_B N \right)$.

Abstract

We present an optimal partially-persistent external-memory search tree with amortized I/O bounds matching those achieved by the non-persistent $B^{\varepsilon}$-tree by Brodal and Fagerberg [SODA 2003]. In a partially-persistent data structure each update creates a new version of the data structure, where all past versions can be queried, but only the current version can be updated. All operations should be efficient with respect to the size $N_v$ of the accessed version $v$. For any parameter $0<\varepsilon<1$, our data structure supports insertions and deletions in amortized $O\!\left(\frac{1}{\varepsilon B^{1-\varepsilon}}\log_B N_v\right)$ I/Os, where $B$ is the external-memory block size. It also supports successor and range reporting queries in amortized $O\!\left(\frac{1}{\varepsilon}\log_B N_v+K/B\right)$ I/Os, where $K$ is the number of values reported. The space usage of the data structure is linear in the total number of updates. We make the standard and minimal assumption that the internal memory has size $M \geq 2B$. The previous state-of-the-art external-memory partially-persistent search tree by Arge, Danner and Teh [JEA 2003] supports all operations in worst-case $O\!\left(\log_B N_v+K/B\right)$ I/Os, matching the bounds achieved by the classical B-tree by Bayer and McCreight [Acta Informatica 1972]. Our data structure successfully combines buffering updates with partial persistence. The I/O bounds can also be achieved in the worst-case sense, by slightly modifying our data structure and under the requirement that the memory size $M = \Omega\!\left(B^{1-\varepsilon}\log_2(\max_v N_v)\right)$. The worst-case result slightly improves the memory requirement over the previous ephemeral external-memory dictionary by Das, Iacono, and Nekrich [ISAAC 2022], who achieved matching worst-case I/O bounds but required $M=\Omega\!\left(B\log_B N\right)$.
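
To make the partial-persistence interface described above concrete, the following is a minimal in-memory sketch of its semantics only. It is not the paper's data structure: it copies the whole sorted set on every update, so it uses neither buffering nor linear space and says nothing about I/Os. The class name `NaivePartiallyPersistentSet`, the method names, the convention that version 0 is the empty set, and the choice that `successor` returns the smallest stored key greater than or equal to the query key are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


class NaivePartiallyPersistentSet:
    """Toy model of partial persistence (hypothetical, for illustration).

    Every update appends a full sorted snapshot, so any past version can be
    queried, but only the newest version can be updated.
    """

    def __init__(self):
        # Assumption: version 0 is the empty set.
        self._versions = [[]]

    @property
    def current(self):
        """Identifier of the newest (the only updatable) version."""
        return len(self._versions) - 1

    def insert(self, key):
        """Insert key into the current version; return the new version id."""
        snapshot = list(self._versions[-1])  # full copy: not space-efficient
        i = bisect_left(snapshot, key)
        if i == len(snapshot) or snapshot[i] != key:
            snapshot.insert(i, key)
        self._versions.append(snapshot)
        return self.current

    def delete(self, key):
        """Delete key from the current version; return the new version id."""
        snapshot = list(self._versions[-1])
        i = bisect_left(snapshot, key)
        if i < len(snapshot) and snapshot[i] == key:
            snapshot.pop(i)
        self._versions.append(snapshot)
        return self.current

    def successor(self, version, key):
        """Smallest stored key >= key in the given version, or None."""
        snapshot = self._versions[version]
        i = bisect_left(snapshot, key)
        return snapshot[i] if i < len(snapshot) else None

    def range_report(self, version, lo, hi):
        """All stored keys k with lo <= k <= hi in the given version."""
        snapshot = self._versions[version]
        return snapshot[bisect_left(snapshot, lo):bisect_right(snapshot, hi)]


s = NaivePartiallyPersistentSet()
v1 = s.insert(5)   # version 1 contains {5}
v2 = s.insert(9)   # version 2 contains {5, 9}
v3 = s.delete(5)   # version 3 contains {9}
old_range = s.range_report(v2, 0, 10)  # query a past version
new_range = s.range_report(v3, 0, 10)  # query the current version
```

Here the query on `v2` still sees the key 5 even though it was deleted in `v3`, which is the behaviour the abstract requires of past versions. The sizes of the snapshots play the role of $N_v$; the paper's contribution is achieving this interface with the stated I/O and space bounds, which this sketch does not attempt.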

Paper Structure

This paper contains 3 sections.