Table of Contents
Fetching ...

Cache-Oblivious Representation of B-Tree Structures

Lukáš Ondráček, Ondřej Mička

TL;DR

The complexity of depth-first search of a subtree and contiguous memory area is shown and the relationship between positions of vertices in tree and in memory is provided to provide better insight into the relationship between positions of vertices in tree and in memory.

Abstract

We propose a general data structure CORoBTS for storing B-tree-like search trees dynamically in a cache-oblivious way combining the van Emde Boas memory layout with packed memory array. In the use of the vEB layout mostly search complexity was considered, so far. We show the complexity of depth-first search of a subtree and contiguous memory area and provide better insight into the relationship between positions of vertices in tree and in memory. We describe how to build an arbitrary tree in vEB layout if we can simulate its depth-first search. Similarly, we examine batch updates of packed memory array. In CORoBTS, the stored search tree has to satisfy that all leaves are at the same depth and vertices have arity between the chosen constants $a$ and $b$. The data structure allows searching with an optimal I/O complexity $\mathcal{O}(\log_B{N})$ and is stored in linear space. It provides operations for inserting and removing a subtree; both have an amortized I/O complexity $\mathcal{O}(S\cdot(\log^2 N)/B + \log_B N\cdot\log\log S + 1)$ and amortized time complexity $\mathcal{O}(S\cdot\log^2 N)$, where $S$ is the size of the subtree and $N$ the size of the whole stored tree. Rebuilding an existing subtree saves the multiplicative $\mathcal{O}(\log^2 N)$ in both complexities if the number of vertices on individual tree levels is not changed; it is paid only for the inserted/removed vertices otherwise. Modifying cache-oblivious partially persistent array proposed by Davoodi et al. [ESA, pages 296-308. Springer, 2014] to use CORoBTS improves its space complexity from $\mathcal{O}(U^{\log_2 3} + V \log U)$ to $\mathcal{O}(U + V \log U)$, where $U$ is the maximal size of the array and $V$ is the number of versions; the data locality and I/O complexity of both present and persistent reads are kept unchanged; I/O complexity of writes is worsened by a polylogarithmic factor.

Cache-Oblivious Representation of B-Tree Structures

TL;DR

The complexity of depth-first search of a subtree and contiguous memory area is shown and the relationship between positions of vertices in tree and in memory is provided to provide better insight into the relationship between positions of vertices in tree and in memory.

Abstract

We propose a general data structure CORoBTS for storing B-tree-like search trees dynamically in a cache-oblivious way combining the van Emde Boas memory layout with packed memory array. In the use of the vEB layout mostly search complexity was considered, so far. We show the complexity of depth-first search of a subtree and contiguous memory area and provide better insight into the relationship between positions of vertices in tree and in memory. We describe how to build an arbitrary tree in vEB layout if we can simulate its depth-first search. Similarly, we examine batch updates of packed memory array. In CORoBTS, the stored search tree has to satisfy that all leaves are at the same depth and vertices have arity between the chosen constants and . The data structure allows searching with an optimal I/O complexity and is stored in linear space. It provides operations for inserting and removing a subtree; both have an amortized I/O complexity and amortized time complexity , where is the size of the subtree and the size of the whole stored tree. Rebuilding an existing subtree saves the multiplicative in both complexities if the number of vertices on individual tree levels is not changed; it is paid only for the inserted/removed vertices otherwise. Modifying cache-oblivious partially persistent array proposed by Davoodi et al. [ESA, pages 296-308. Springer, 2014] to use CORoBTS improves its space complexity from to , where is the maximal size of the array and is the number of versions; the data locality and I/O complexity of both present and persistent reads are kept unchanged; I/O complexity of writes is worsened by a polylogarithmic factor.
Paper Structure